NOVEL MOUSE POLYPEPTIDES ENCODED BY
POLYNUCLEOTIDES AND METHODS OF THEIR USE

Priority Claim 

[001] This application is related to the following provisional applications
filed in the United States Patent and Trademark Office, the disclosures of which are 
hereby incorporated by reference: 



Application Number   Title                                                Filing Date

60/476,632           Novel Mouse Polynucleotides Relating to Kinases,     June 9, 2003
                     Phosphatases, and Proteases

60/476,621           Methods of Use for Novel Mouse Polynucleotides       June 9, 2003
                     Relating to Kinases, Phosphatases, and Proteases

60/485,539           Novel Mouse Polynucleotides Relating to Secreted     July 8, 2003
                     and Transmembrane Proteins

60/485,217           Methods of Use for Novel Mouse Polynucleotides       July 8, 2003
                     Relating to Secreted and Transmembrane Proteins



Technical Field

[002] The present invention is related generally to novel
polynucleotides and novel polypeptides encoded thereby, their compositions,
antibodies directed thereto, and other agonists or antagonists thereto. The 
polynucleotides and polypeptides are useful in diagnostic, prophylactic, and 
therapeutic applications for a variety of diseases, disorders, syndromes and 
conditions, as well as in discovering new diagnostics, prophylactics, and therapeutics 
for such diseases, disorders, syndromes, and conditions (hereinafter, disorders). The
present invention also relates to methods of modulating biological activities through 
the use of the novel polynucleotides and novel polypeptides of the invention and 
through the use of agonists and antagonists, such as antibodies, thereto. 

[003] This application further relates to the field of polypeptides that
are associated with regulating cell growth and differentiation, that are over-expressed
in cancer, and/or that can be associated with proliferation or inhibition of cancer
growth, including hematopoietic cancers, such as leukemias and lymphomas, and solid
cancers, such as lung cancer, for example, adenocarcinomas and/or squamous cell
carcinomas. These polypeptides may also be associated with other conditions, such as 
inflammatory, immune, and metabolic disorders, as well as microbial infections, 
including viral, bacterial, fungal, and parasitic diseases, disorders, syndromes, or 
conditions. 

[004] This application further relates to modulators of biological
activity that can specifically bind to these polynucleotides or polypeptides, or
otherwise specifically modulate their activity. For example, they can directly or
indirectly induce antibody-dependent cellular cytotoxicity (ADCC), complement- 
dependent cytotoxicity (CDC), endocytosis, apoptosis, or recruitment of other cells to 
effect cell activation, cell inactivation, cell growth or differentiation or inhibition 
thereof, and cell killing. 

[005] The sequences of the invention encompass a variety of different
types of nucleic acids and polypeptides with different structures and functions. They
can encode or comprise polypeptides belonging to different protein families ("Pfam"). 
The "Pfam" system is an organization of protein sequence classification and analysis, 
based on conserved protein domains; it can be publicly accessed in a number of ways, 
for example, at http://pfam.wustl.edu. Protein domains are portions of proteins that 
have a tertiary structure and sometimes have enzymatic or binding activities; multiple 
domains can be connected by flexible polypeptide regions within a protein. Pfam 
domains can comprise the N-terminus or the C-terminus of a protein, or can be 
situated at any point in between. The Pfam system identifies protein families based 
on these domains and provides an annotated, searchable database that classifies 
proteins into families (Bateman et al., 2002). 

[006] Sequences of the invention can encode or be comprised of more
than one Pfam. Sequences encompassed by the invention include, but are not limited
to, the polypeptide and polynucleotide sequences of the molecules shown in the 
Sequence Listing and corresponding molecular sequences found at all developmental 
stages of an organism. Sequences of the invention can comprise genes or gene 
segments designated by the Sequence Listing, and their gene products, i.e., RNA and 
polypeptides. They also include variants of those presented in the Sequence Listing 
that are present in the normal physiological state, e.g., variant alleles such as SNPs
and splice variants, as well as variants that are affected in pathological states, such as
disease-related mutations or sequences with alterations that lead to pathology, and
variants with conservative amino acid changes. Sequences of the invention are
categorized below; any given sequence can belong to one or more than one category.

Secreted Protein-Related Sequences

[007] Secreted proteins, also referred to as secreted factors, include proteins 
that are produced by cells and exported extracellularly, extracellular fragments of 
transmembrane proteins that are proteolytically cleaved, and extracellular fragments 
of cell surface receptors, which fragments may be soluble. An example of a secreted 
protein is keratinocyte growth factor (KGF), which stimulates the growth of
keratinocytes, and is useful for repairing tissue after chemotherapy or radiotherapy. 

[008] Many and widely variant biological functions are mediated by a wide 
variety of different types of secreted proteins. Yet, despite the sequencing of the 
human genome, relatively few pharmaceutically useful secreted proteins have been 
identified. It would be advantageous to discover novel secreted proteins or 
polypeptides, and their corresponding polynucleotides that have medical utility. 

[009] Pharmaceutically useful secreted proteins of the present invention 
will have in common the ability to act as ligands for binding to receptors on cell 
surfaces in ligand/receptor interactions, to trigger certain intracellular responses, such
as inducing signal transduction to activate cells or inhibit cellular activity, to induce 
cellular growth, proliferation, or differentiation, or to induce the production of other 
factors that, in turn, mediate such activities. 

[010] The cell types having cell surface receptors responsive to secreted 
proteins are various, including, for example, stem cells; progenitor cells; and 
precursor cells and mature cells of the hematopoietic, hepatic, neural, lung, heart, 
thymic, splenic, epithelial, pancreatic, adipose, gastrointestinal, colonic, optic,
olfactory, bone and musculoskeletal lineages. Further, the hematopoietic cells can be
red blood cells or white blood cells, including cells of the B lymphocytic (B cell), T
lymphocytic (T cell), dendritic, megakaryocytic, natural killer (NK), macrophagic,
eosinophilic, and basophilic lineages. The cell types responsive to secreted proteins
also include normal cells or cells implicated in disease, disorders, syndromes, or other 
pathological conditions. 

[011] As an example, certain of the secreted proteins of the present
invention can stimulate T or B cell growth or differentiation by interacting with 
precursor T or B cells or hematopoietic progenitor cells, or bone marrow stem cells. 
As another example, certain secreted proteins of the present invention can maintain 
stem cells, progenitor cells or precursor cells in an undifferentiated state. As a further
example, certain secreted proteins of the present invention can regulate bone growth
by stimulation or inhibition thereof, secretion of insulin, glucose metabolism, cell
proliferation, response to microbial infection, and regeneration of tissues including
neural, muscular, and epithelial. Moreover, certain secreted proteins of the present
invention can induce apoptosis such as in cancer cells or inflammatory cells. 

[012] Certain of the secreted proteins of the present invention are useful for 
diagnosis, prophylaxis, or treatment of disorders in subjects that are deficient in such 
secreted proteins or require regeneration of certain tissues, the proliferation of which 
is dependent on such secreted proteins, or requires an inhibition or activation of 
growth that is dependent on such secreted proteins. Examples of such disorders 
include cancer, such as bone cancer, brain tumors, breast and ovarian cancer, Burkitt's 
lymphoma, chronic myeloid leukemia, colon cancer, endocrine system cancers, 
gastrointestinal cancers, gynecological cancers, head and neck cancers, leukemia, 
lung cancer, lymphomas, malignant melanoma, metastases, multiple endocrine 
neoplasia, myelomas, neurofibromatosis, pancreatic cancer, pediatric cancers, penile 
cancer, prostate cancer, disorders related to the Ras oncogene, retinoblastoma (RB), 
sarcomas, skin cancers, testicular cancer, thyroid cancer, urinary tract cancers, and 
von Hippel-Lindau syndrome. 

[013] Certain of the secreted proteins herein can be used for diagnosis, 
prophylaxis, and treatment of disorders of hematopoiesis, including thrombosis;
bleeding; anemias, e.g., iron deficiency and other hypoproliferative anemias, 
megaloblastic anemias, hemolytic anemias, acute blood loss, and aplastic anemia; 
hemoglobinopathies; disorders of granulocytes and monocytes; myelodysplasias and 
related bone marrow failure syndromes; polycythemias, e.g., polycythemia vera; acute 
and chronic myeloid leukemia, and other myeloproliferative diseases, e.g., 
malignancies of lymphoid cells; stimulation of replacement cell growth following 
irradiation or chemotherapy; and plasma cell disorders. 

[014] Certain of the secreted proteins herein can be used for diagnosis, 
prophylaxis, and treatment of disorders of hemostasis, such as disorders of the platelet 
and vessel wall, disorders of coagulation and thrombosis, and anticoagulant, 
fibrinolytic and antiplatelet therapies. 

[015] Certain of the secreted proteins herein can be used for diagnosis, 
prophylaxis, and treatment of disorders of the cardiovascular system including 
disorders of the heart, such as heart failure; congenital heart disease; rheumatic fever;
cor pulmonale; cardiomyopathies, e.g., myocarditis; pericardial disease; cardiac
tumors; cardiac manifestations of systemic diseases; and vascular diseases, such as 
acute myocardial infarction, ischemic heart disease, hypertensive vascular disease, 
diseases of the aorta, and vascular diseases of the extremities. 

[016] Certain of the secreted proteins herein can be used for diagnosis,
prophylaxis, and treatment of disorders of the respiratory system, such as asthma, 
hypersensitivity pneumonitis, e.g., with pulmonary infiltration, pneumonia, 
necrotizing pulmonary infections, bronchiectasis, cystic fibrosis, chronic bronchitis, 
emphysema and airway obstruction, interstitial lung diseases, primary pulmonary
hypertension, pulmonary thromboembolism, disorders of the pleura, mediastinum,
and diaphragm, disorders of ventilation, sleep apnea, and acute respiratory distress
syndrome.

[017] Certain of the secreted proteins herein can be used for diagnosis, 
prophylaxis, and treatment of disorders of the kidney and urinary tract, such as, for 
example, chronic renal failure and glomerulopathies. 

[018] Certain of the secreted proteins herein can be used for diagnosis,
prophylaxis, and treatment of disorders of the gastrointestinal system, including 
disorders of the alimentary tract, such as, for example, peptic ulcer disease and related 
disorders, inflammatory bowel disease, irritable bowel syndrome; disorders of the 
liver and biliary tract, such as, for example, hyperbilirubinemias, acute viral hepatitis, 
chronic hepatitis, and cirrhosis; and disorders of the pancreas, such as acute or chronic 
pancreatitis. 

[019] Certain of the secreted proteins herein can be used for diagnosis,
prophylaxis, and treatment of disorders of the immune system, connective tissue, and 
joints, including, for example, autoimmune diseases, primary immune deficiency 
diseases, human immunodeficiency virus diseases, allergies, systemic lupus
erythematosus, rheumatoid arthritis, systemic sclerosis, Sjogren's syndrome, 
ankylosing spondylitis, reactive arthritis, vasculitis, sarcoidosis, amyloidosis, 
osteoarthritis, gout, and psoriatic and other arthritis.

[020] Certain of the secreted proteins herein can be used for diagnosis, 
prophylaxis, and treatment of disorders of the endocrine system, including, for 
example, disorders of the pituitary, hypothalamus, neurohypophysis, thyroid gland, 
adrenal cortex, testes, ovary, and other organs of the female reproductive system, such
as breast; as well as pheochromocytoma, diabetes mellitus, and hypoglycemia. 

[021] Certain of the secreted proteins herein can be used for diagnosis,
prophylaxis, and treatment of disorders of bone and mineral metabolism, and other 
metabolic processes, including, for example, diseases of the parathyroid gland and 
other hyper- and hypocalcemic disorders, osteoporosis, Paget's disease and other 
dysplasia of bone, disorders of lipoprotein metabolism, hemochromatosis, porphyrias,
disorders of purine and pyrimidine metabolism, Wilson's disease, lysosomal storage 
diseases, glycogen storage diseases, lipodystrophies, and other primary disorders of 
adipose tissue. 

[022] Certain of the secreted proteins herein can be used for diagnosis,
prophylaxis, and treatment of disorders of the central nervous system, including, for
example, seizures and epilepsy, cerebrovascular diseases, Alzheimer's disease and
other extrapyramidal disorders, ataxic disorders, amyotrophic lateral sclerosis and
other motor neuron diseases, disorders of the autonomic nervous system, diseases of
the spinal cord, including spinal cord injury, primary and metastatic tumors of the
nervous system, multiple sclerosis, and other demyelinating diseases, as well as 
chronic and recurrent meningitis. 

[023] Certain of the secreted proteins herein can be used for diagnosis,
prophylaxis, and treatment of disorders of nerves or muscle, including, for example, 
Guillain-Barre Syndrome, myasthenia gravis and other diseases of the neuromuscular 
junction, polymyositis, dermatomyositis, muscular dystrophies, and other muscle
diseases. 

[024] Certain of the secreted proteins herein can be used for diagnosis, 
prophylaxis, and treatment of disorders of the skin, including, for example, eczema, 
psoriasis, cutaneous infections, acne, and other common skin disorders, and 
immunologically mediated skin diseases. 

[025] The agonists or antagonists of the secreted proteins herein or 
fragments thereof can be useful in treating elevated levels of such proteins in any of the
disorders above, and including angina, anoxia, arrhythmias, asthma, atherosclerosis,
benign prostatic hyperplasia, Buerger's Disease, cardiac arrest, cardiogenic shock, 
cerebral trauma, Crohn's Disease, congenital heart disease, mild congestive heart 
failure (CHF), severe congestive heart failure, cerebral ischemia, cerebral infarction, 
cerebral vasospasm, cirrhosis, diabetes, dilated cardiomyopathy, endotoxic shock, 
gastric mucosal damage, glaucoma, head injury, hemodialysis, hemorrhagic shock,
hypertension (essential), hypertension (malignant), hypertension (pulmonary), 
hypertension (e.g., pulmonary, after bypass), hypoglycemia, inflammatory arthritis, 
ischemic bowel disease, ischemic disease, male penile erectile dysfunction, malignant
hemangioendothelioma, myocardial infarction, myocardial ischemia, prenatal 
asphyxia, postoperative cardiac surgery, prostate cancer, preeclampsia, Raynaud's 
Phenomenon, renal failure (acute), renal failure (chronic), renal ischemia, restenosis, 
sepsis syndrome, subarachnoid hemorrhage (acute), surgical operations, status
epilepticus, stroke (thromboembolic), stroke (hemorrhagic), Takayasu's arteritis,
ulcerative colitis, uremia after hemodialysis, and uremia before hemodialysis. 

[026] Secreted proteins can be screened for functional activities in
appropriate functional assays, as is conventional in the art. Such assays include, for
example, in vitro and in vivo assays for factors that stimulate the proliferation or 
differentiation of stem cells, progenitor cells, or precursor cells into T cells, B cells, 
pancreatic islet cells, bone cells, neuronal cells, etc. 

[027] The tetratricopeptide repeat (TPR) is an example of a protein domain 
characteristic of a protein family, and is present in some of the secreted polypeptides 
of the invention. The TPR family is characterized by a degenerate 34 amino acid 
sequence present in a wide variety of proteins; it mediates protein-protein interactions, 
and is involved in scaffold formation and the assembly of multiprotein complexes 
(http://pfam.wustl.edu/cgi-bin/getdesc?name=TPR). Secreted protein-related 
sequences can also possess or interact with cytochrome P450 domains, which are 
involved in the oxidative degradation of various compounds, including environmental
toxins and mutagens (http://pfam.wustl.edu/cgi-bin/getdesc?name=p450). Secreted 
protein-related sequences, e.g., cholesteryl ester transfer protein and phospholipid 
transfer protein, can also possess or interact with the LBP/BPI/CETP domain, which 
is characteristically found in lipid-binding serum glycoproteins
(http://pfam.wustl.edu/cgi-bin/getdesc?name=LBP_BPI_CETP). Secreted protein-related sequences can
also possess or interact with peptidase S8 domains, also known as subtilase domains, 
which are comprised of serine proteases with a wide range of peptidase activities, 
including exopeptidase, endopeptidase, oligopeptidase, and omega-peptidase activity 
(http://pfam.wustl.edu/cgi-bin/getdesc?name=Peptidase_S8). Secreted protein-
related sequences can also possess or interact with adh_short, or short-chain
dehydrogenase domains, which are found in a large family of proteins, and are made
up of short-chain dehydrogenases and reductase enzymes; most family members
function as NAD- or NADP-dependent oxidoreductases
(http://pfam.wustl.edu/cgi-bin/getdesc?name=adh_short).

[028] The inventors herein have identified novel secreted proteins using an 
algorithm that is constructed on the basis of a number of attributes including 
hydrophobicity, two-dimensional structure, prediction of signal sequence cleavage 
site, and other parameters. Based on such algorithm, a sequence that has a secreted 
tree vote of 0.5 - 1.0, preferably 0.6 - 1.0, is believed to be a secreted protein.
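
The cutoffs stated in the preceding paragraph can be illustrated with a minimal sketch. Only the 0.5 and 0.6 lower bounds and the 1.0 upper bound come from the text; the function name `is_predicted_secreted`, the `preferred` flag, and the check that a vote lies between 0 and 1 are hypothetical and are not part of the algorithm referred to above, whose scoring step is not shown here.

```python
def is_predicted_secreted(tree_vote: float, preferred: bool = False) -> bool:
    """Apply the secreted tree vote cutoff described in the text.

    tree_vote: score assumed (for illustration only) to lie in [0.0, 1.0].
    preferred: if True, use the narrower 0.6 - 1.0 range instead of 0.5 - 1.0.
    """
    if not 0.0 <= tree_vote <= 1.0:
        # Assumption: votes outside [0, 1] are treated as invalid input.
        raise ValueError("tree vote is assumed to lie between 0.0 and 1.0")
    lower_bound = 0.6 if preferred else 0.5
    return tree_vote >= lower_bound
```

Under these assumptions, a sequence with a vote of 0.55 would fall within the broader range but outside the preferred range.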
Transmembrane Protein-Related Sequences 

[029] Transmembrane proteins extend into or through the cell membrane's 
lipid bilayer; they can span the membrane once, or more than once. Transmembrane 
proteins that span the membrane once are "single transmembrane proteins" (STM), 
and transmembrane proteins that span the membrane more than once are "multiple 
transmembrane proteins" (MTM). Examples of transmembrane proteins include the 
insulin receptor, adenylate cyclase, and intestinal brush border esterase. 

[030] A single transmembrane protein typically has one transmembrane 
(TM) domain, spanning a series of consecutive amino acid residues, numbered on the
basis of distance from the N-terminus, with the first amino acid residue at the N-
terminus as number 1. A multi-transmembrane protein typically has more than one
TM domain, each spanning a series of consecutive amino acid residues, numbered in 
the same way as the STM protein. 
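
The residue numbering convention described above (first N-terminal residue is number 1, each TM domain is a run of consecutive residues) can be sketched as follows. The function names, the tuple representation of a TM domain, and the example sequence are hypothetical and serve only to illustrate the 1-based, inclusive numbering and the STM/MTM distinction.

```python
from typing import List, Tuple


def tm_segments(sequence: str, spans: List[Tuple[int, int]]) -> List[str]:
    """Return the residues of each TM domain.

    spans: (start, end) positions, numbered from the N-terminus with the
    first residue as number 1, both ends inclusive.
    """
    segments = []
    for start, end in spans:
        if not 1 <= start <= end <= len(sequence):
            raise ValueError(f"span ({start}, {end}) is outside the sequence")
        # Convert 1-based inclusive numbering to Python's 0-based slicing.
        segments.append(sequence[start - 1:end])
    return segments


def classify_tm(spans: List[Tuple[int, int]]) -> str:
    """Label a protein by its number of TM domains: 'STM', 'MTM', or 'none'."""
    if not spans:
        return "none"
    return "STM" if len(spans) == 1 else "MTM"
```

For example, with the made-up sequence "MKTLLVAAGG" and a single span (3, 6), the extracted segment is residues 3 through 6 and the protein is labeled STM.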

[031] Transmembrane proteins, having part of their molecules on either
side of the bilayers, have many and widely variant biological functions. They 
transport molecules, e.g., ions or proteins across membranes, transduce signals across 
membranes, act as receptors, and function as antigens. Transmembrane proteins are 
often involved in cell signaling events; they can comprise signaling molecules, or can 
interact with signaling molecules. For example, tyrosine kinases can be 
transmembrane receptor proteins. Abnormalities of receptor tyrosine kinases are 
associated with human cancers; tumor cells are known to use receptor tyrosine kinases 
in transduction pathways to achieve tumor growth, angiogenesis and metastasis. 
Therefore, receptor tyrosine kinases represent pivotal targets in cancer therapy. It 
would be similarly advantageous to discover novel transmembrane proteins or 
polypeptides, and their corresponding polynucleotides that have additional medical
utility. 

[032] The transmembrane polypeptides of the invention, like the secreted
polypeptides, also have many different functional domains, and belong to a wide 
variety of Pfam families. Transmembrane protein-related sequences can possess or 
interact with immunoglobulin (ig) domains, which are characteristically found in the 
immunoglobulin superfamily, comprised of hundreds of proteins, with various 
functions (http://pfam.wustl.edu/cgi-bin/getdesc?name=ig). Transmembrane protein-
related sequences can also possess or interact with ion_trans domains, which are
polypeptides characterized by six transmembrane helices, and which transport ions 
across membranes (http://pfam.wustl.edu/cgi-bin/getdesc?name=ion_trans). Proteins 
in this family can demonstrate specificity for particular ions, e.g., sodium, potassium,
and calcium. Transmembrane protein-related sequences can also possess or interact
with integrase core domains, which mediate the integration of a DNA copy of a viral
genome into a host chromosome; e.g., HIV integrase catalyses the incorporation of 
virally derived DNA into the human genome, presenting a target for the development 
of new therapeutics for the treatment of AIDS
(http://pfam.wustl.edu/cgi-bin/getdesc?name=rve). Transmembrane protein-related sequences can also possess
or interact with "differentially expressed in neoplastic vs. normal cells" (DENN)
domains, which are involved in signal transduction.
Characteristically, these domains are found in protein components of signaling 
pathways that utilize rab proteins or mitogen-activated protein (MAP) kinases 
(http://pfam.wustl.edu/cgi-bin/getdesc?name=DENN). 

[033] Transmembrane protein-related sequences can also possess or interact 
with acyl-CoA binding protein (ACBP) domains, which are protein domains that bind
medium- and long-chain acyl-CoA esters with high affinity
(http://pfam.wustl.edu/cgi-bin/getdesc?name=ACBP). Membrane-related sequences can also possess or interact
with SPFH domain/Band 7 family (Band_7) domains, which are protein domains that
include a transmembrane segment, and regulate cation conductivity
(http://pfam.wustl.edu/cgi-bin/getdesc?name=Band_7). 

[034] Transmembrane proteins that are differentially expressed on the
surface of cancer cells, particularly those that are differentially expressed on the 
surface of cancer cells but not on the surface of normal tissues, such as heart and lung, 
are desirable targets for production of antibodies, e.g., diagnostic antibodies or 
therapeutic antibodies, such as antibodies that mediate ADCC or CDC to effect tumor 
cell killing. 

[035] Transmembrane proteins with extracellular fragments that can be
cleaved can be useful as secreted proteins to effect ligand/receptor binding so as to 
mediate intracellular responses, such as signal transduction. Transmembrane proteins 
that act as receptors, and possess a ligand binding extracellular portion exposed on a 
cell surface and an intracellular portion that interacts with other cellular components 
upon activation can also be useful as transmembrane proteins to mediate
intracellular responses, such as signal transduction.

Kinase-Related Sequences

[036] A kinase is an enzyme that catalyzes the transfer of phosphate groups 
from phosphate donors to acceptor substrates. Kinase substrates include, but are not 
limited to, proteins and lipids. Sequences of the invention that phosphorylate protein 
substrates are designated "Pkinases." Examples of kinase-related sequences include
calcium/calmodulin-dependent protein kinase II, myosin light chain kinase, and
phosphatidylinositol kinase.

[037] Kinases and phosphatases are counteracting: kinases add phosphate 
groups and phosphatases liberate phosphate groups. The counteracting activities of 
kinases and phosphatases provide cells with a "switch" that can turn on or turn off the 
function of various proteins. The activity of any protein regulated by phosphorylation 
depends on the balance, at any given time, between the activities of the kinase(s) that 
phosphorylate it, and the phosphatase(s) that dephosphorylate it. Phosphorylation 
plays an important role in intercellular communication during development,
homeostasis, and the function of major bodily systems, including the immune system. 

[038] In conjunction with phosphatases, kinases control such diverse and
essential cellular processes as transcription, cell division, cell cycle progression,
differentiation, cytoskeletal function, apoptosis, receptor function, learning and
memory, hematopoiesis, fertilization, neural transmission, muscle contraction, non-
muscle motor function, glycogen metabolism, and hormone secretion.

[039] Most kinases act within a network of kinases and other signaling 
effectors, and are modulated by autophosphorylation and phosphorylation by other 
kinases (Manning et al., 2002). Intracellular signaling involves a multitude of diverse 
mechanisms that combine to modulate the activity of individual proteins in response 
to different biological inputs. 

WO 2005/005597                                    PCT/US2003/027106

[040] Defects in cell signal transduction pathways are responsible for a
number of disorders, including the majority of cancers, immune disorders, and many
inflammatory conditions, including, but not limited to, Crohn's disease (Geffen and
Man, 2002; Van Den Blink et al., 2002; Lodish 1999). Over-expression and/or 
structural alteration of kinases, for example, receptor tyrosine kinase family members, 
is often associated with human cancers. For example, tumor cells are known to use 
receptor tyrosine kinases in transduction pathways to achieve tumor growth, 
angiogenesis and metastasis. Therefore, receptor tyrosine kinases represent pivotal 
targets in cancer therapy. A number of small molecule receptor tyrosine kinase 
inhibitors have been synthesized, are in clinical trials, are being analyzed in animal 
models, or have been marketed. Inhibitory mechanisms include ligand-dependent 
down regulation, e.g., by the adaptor Cbl (Brunelleschi et al., 2002). 

[041] Kinase-related sequences can possess or interact with protein kinase
(pkinase) domains, which share a conserved catalytic core common in
serine/threonine and tyrosine protein kinases
(http://pfam.wustl.edu/cgi-bin/getdesc?name=pkinase). Kinase-related sequences can also possess or interact
with A-kinase anchoring protein 95 (AKAP95) domains, which comprise two zinc
fingers, and have been implicated in chromosome condensation
(http://pfam.wustl.edu/cgi-bin/getdesc?name=AKAP95). Kinase-related sequences can also possess or
interact with inositol 1,3,4-trisphosphate 5/6-kinase (Ins134_P3_kin) domains, which
mediate the function of inositol 1,3,4-trisphosphate, a branch point in inositol
phosphate metabolism (http://pfam.wustl.edu/cgi-bin/getdesc?name=Ins134_P3_kin).

[042] Kinases, by virtue of their participation in many and varied 
intracellular activities, are useful as targets of therapeutic intervention such as, for 
example, in cancer and inflammation. Cells transfected with cDNA encoding a kinase 
can be used in screening for small molecule agonists or antagonists, for example.

Ligase-Related Sequences

[043] Ligases are enzymes that join together, or ligate, two molecules. 
Ligase substrates include nucleic acids and proteins. For example, DNA ligases link 
two DNA molecules together; they play a role in DNA repair and replication. DNA 
ligases also are involved in the rearrangement of immunoglobulin gene segments,
such as those responsible for the generation of antibody diversity. Examples of 
protein ligases include ubiquitin protein ligases, which add an ubiquitin molecule to 
an amino acid residue, typically as part of a peptide or polypeptide. Examples of 
nucleic acid ligases include DNA ligase I, DNA ligase III alpha, and T4 RNA 
ligase 2.

[044] Ligases are also involved in cellular regulatory processes. For
example, glutamate-cysteine ligase (GCL) is the first and rate-limiting enzyme
involved in the biosynthesis of glutathione. Polymorphisms of human GCL account
for differences in sensitivity to environmental toxicants and chemotherapeutic agents
in human cancer cell lines (Walsh et al., 2001). Also by way of example, glutamate-
ammonia ligase, or glutamine synthetase (GS), is expressed at a higher than normal
level in human primary liver cancer, and may be involved in hepatocyte
transformation (Christa et al., 1994).

[045] Ligase-related sequences can possess or interact with ATP dependent
DNA ligase (DNA_ligase) domains, which can join two DNA fragments by
catalyzing the formation of an internucleotide ester bond between a phosphate and a
deoxyribose (http://pfam.wustl.edu/cgi-bin/getdesc?name=DNA_ligase). Ligase-
related sequences can also possess or interact with glutamate-cysteine ligase (GCS)
domains, which catalyze the rate-limiting step in the biosynthesis of glutathione
(http://pfam.wustl.edu/cgi-bin/getdesc?name=GCS). Ligase-related sequences can
also possess or interact with 2',5' RNA ligase (2_5_ligase) domains, which ligate
tRNA half molecules containing 2',3'-cyclic phosphate and 5'-hydroxyl termini to
products containing a 2',5'-phosphodiester linkage
(http://pfam.wustl.edu/cgi-bin/getdesc?name=2_5_ligase).

[046] Like kinases, ligases are also useful as targets for identification of
agonists and antagonists, such as small molecule drugs. 

Receptor-Related Sequences (Including Nuclear Hormone and T-Cell Receptors) 

[047] A receptor is a polypeptide that binds to a specific signaling 
molecule and initiates a cellular response. Receptors can be present on the cell 
surface or inside the cell. Examples of receptor types include G-protein-linked
receptors, ion channel-linked receptors, enzyme-linked receptors, T-cell receptors,
thyroid hormone receptors, retinoid receptors, nuclear hormone receptors, and the
related category of steroid hormone receptors, e.g., cortisol receptors (Alberts et al.,
1994). 

[048] G-protein-linked receptors transduce extracellular signals into
intracellular responses by interacting with guanine nucleotide binding proteins. The
same ligand can activate many different G-protein-linked receptors. G-protein-linked 
receptors mediate cellular responses to a diverse range of signaling molecules, 
including hormones, neurotransmitters, and local mediators, which are varied in 
structure and function, and encompass proteins and small peptides, as well as amino
acids and their derivatives, and fatty acids and their derivatives. Many signaling 
molecules are active at low concentrations, and their receptors often bind with high 
affinity. Examples of G-protein-linked receptors include, but are not limited to,
rhodopsins, olfactory receptors, and β-adrenergic receptors.

[049] Ion channel-linked receptors are involved in synaptic signaling.
These receptors regulate ion channels, to which they are linked. Some respond to
signals from neurotransmitters, e.g., acetylcholine, serotonin, GABA, and glycine. A 
common mechanism of action for ion channel-linked receptors is to transiently open 
or close their respective ion channel, transiently changing the permeability of the 
membrane in which they reside to a specific ion or ions. 

[050] Enzyme-linked receptors can be linked to enzymes or can
function as enzymes. Their ligand binding site is commonly on one side of the
membrane, e.g., an extracellular domain, and the catalytic site is on the other, e.g., a
cytoplasmic domain. Transmembrane tyrosine-specific protein kinase receptors for
growth and differentiation factors are enzyme-linked receptors; examples include
receptors for epidermal growth factor (EGF), platelet-derived growth factor (PDGF),
fibroblast growth factors (FGFs), hepatocyte growth factor (HGF), insulin,
insulin-like growth factor-1 (IGF-1), nerve growth factor (NGF), vascular endothelial growth
factor (VEGF), and macrophage colony stimulating factor (M-CSF). 

[051] Nuclear hormone receptors are generally activated by ligands that cross the
plasma membrane of target cells and bind to these intracellular receptor proteins. Ligand
binding activates these receptors in some instances, exposing a DNA binding domain 
which regulates the transcription of specific genes. Generally, nuclear hormone 
receptors bind to specific DNA sequences adjacent to or in the vicinity of the genes 
regulated by their ligand. A host of cell type-specific regulatory proteins can 
collaborate with the nuclear hormone receptor to influence the transcription of 
specific genes or sets of genes (Alberts et al., 1994). Examples of nuclear hormone 
receptors include estrogen-related receptors, such as hERR1, which modulates the
estrogen receptor-mediated response of the lactoferrin gene promoter (Yang et al.,
1996), and is a transcriptional regulator of the human medium chain acyl coenzyme A
dehydrogenase gene (Sladek et al., 1997). Examples of nuclear hormone receptors
also include photoreceptor-specific nuclear receptors, such as NR2E3, which are part
of a large family of nuclear receptor transcription factors involved in signaling
pathways. NR2E3 plays a role in cone function and human retinal photoreceptor
differentiation and degeneration (Milam et al., 2002; Kobayashi et al., 1999).

[052] T-cell receptors are membrane proteins composed of two
disulfide-linked polypeptide chains, each with two immunoglobulin-like domains.
They display a similarity to antibodies in that they have a variable amino-terminal
region and a constant carboxyl-terminal region which is coded for by variable,
joining, and constant region genes (Wei et al., 1997; Alberts et al., 1994).
Rearrangements of T-cell receptor genes have been associated with human T-cell
leukemias (Fisch et al., 1993). 

[053] Receptors are involved in cellular processes that regulate growth 
and differentiation. Their dysregulation can lead to hyperproliferative conditions, and 
they are common therapeutic targets. For example, the EGF receptor is aberrantly 
activated in neoplasia, especially in tumors of epithelial origin. EGF receptor 
antagonists can successfully treat some of these tumors, either alone or in 
combination with chemotherapy or ionizing radiation (Kari et al., 2003). The 
progesterone receptor, an intracellular steroid hormone receptor, plays a role in the
development and function of the mammary gland, the uterus, and the ovary. Mutation
or aberrant expression of the progesterone receptor, or its regulatory molecules, can 
affect its normal function and lead to cancer (Gao and Nawaz, 2002). 

[054] Receptors are also involved in cellular processes that regulate
inflammation and immunity. For example, members of the type 1 interleukin-1
receptor family mediate immune and inflammatory responses, and function in host
defense (O'Neill, 2002). Their activation can lead to the activation of signaling
cascades, e.g., pathways involving transcription factors and protein kinases, resulting
in an inflammatory response (O'Neill, 2002). Another mechanism by which receptors 
regulate inflammation and immunity is by their selective expression, at discrete stages 
of differentiation, by cells involved in the inflammatory response. For example, 
expression of the triggering receptor expressed on myeloid cells (TREM-1) and the
myeloid DAP12-associating lectin (MDL-1) is correlated with myelomonocytic
differentiation. These receptors are more highly expressed in differentiated cells, are
involved in monocyte activation and the inflammatory response, and are expressed at 
a lower level in malignant compared to normal cells (Gingras et al., 2002).

[055] Receptor-related sequences can possess or interact with seven
transmembrane receptor (7tm_1) domains, which are protein domains with a
structural framework comprising seven transmembrane helices found in receptors, 
e.g., receptors in the rhodopsin family with a wide range of functions, activated by 
ligands that vary widely in structure and character
(http://pfam.wustl.edu/cgi-bin/getdesc?name=7tm_1). Receptor-related sequences can
also possess or interact
with L1 transposable element (transposase_22) domains, some of which have been
characterized to exhibit reverse transcriptase activity, and some of which are capable 
of retrotransposition. Receptor-related sequences can also possess or interact with a 
SH2 domain, which is a protein domain of about 100 amino acid residues found in 
many intracellular signal-transducing proteins, that can regulate intracellular signaling 
cascades by interacting with phosphotyrosine-containing target peptides in a 
sequence-specific and phosphorylation-dependent manner (http://pfam.wustl.edu/cgi- 
bin/getdesc?name=SH2). Receptor-related sequences can also possess or interact 
with LDL receptor domains, e.g., the low-density lipoprotein receptor repeat class B
(ldl_recept_b) domain, which comprises a conserved YWTD motif in multiple
tandem repeats (http://pfam.wustl.edu/cgi-bin/getdesc?name=ldl_recept_b).
Receptor-related sequences can also possess or interact with ribosomal L10
(Ribosomal_L10e) domains, which are protein domains commonly found in the large
ribosomal subunit (http://pfam.wustl.edu/cgi-bin/getdesc?name=Ribosomal_L10e).

[056] Receptor-related sequences can possess or interact with zinc
finger C4 type domains, which are DNA binding domains of nuclear hormone
receptors that share a conserved cysteine-rich region of approximately 65 amino acids 
and regulate such diverse biological processes as pattern formation, cellular 
differentiation, and homeostasis (http://www.sanger.ac.uk/cgi-bin/Pfam/getacc? 
PF00105). Receptor-related sequences can also possess or interact with a ligand 
binding domain of nuclear hormone receptors (hormone_rec), which are helical 
domains involved in the regulation of eukaryotic gene expression, cellular
proliferation, and differentiation in target tissues (http://www.sanger.ac.uk/cgi- 
bin/Pfam/getacc?PF00104). Receptor-related sequences can also possess or interact 
with Mov34 domains, which are regulatory subunits of the proteasome found in some 
regulators of transcription factors (http://www.sanger.ac.uk/cgi-bin/Pfam/getacc?
PF01398). Receptor-related sequences can also possess or interact with 
immunoglobulin domains, which are described above.

[057] Receptors and fragments of receptors can be used as
therapeutics. For example, a ligand-binding portion, an effector-binding portion, and
a kinase or phosphatase domain or consensus sequence can comprise fragments that
can function as agonists or antagonists that enhance or reduce, e.g., ligand binding to the
natural receptors, or effector function by the natural receptors.

Phosphatase-Related Sequences

[058] A phosphatase, as indicated above, is an enzyme that catalyses the
hydrolysis of esters of phosphoric acid. Its substrates include, but are not limited to, 
nucleic acids, proteins, and lipids. Together with kinases, phosphatases are active in a 
broad range of cellular functions, including transcription, cell division, cell-cycle 
progression, intermediate cellular metabolism, glycogen metabolism, lipogenesis and
lipolysis, maintenance of electrochemical gradients, neuronal function, immune
responses, intracellular vesicular transport, cytoskeletal function, sperm motility, and
skeletal, cardiac, and smooth muscle function (Oliver and Shenolikar, 1998).

[059] Disruption in these functions may lead to disorders. For example, as 
noted above, phosphatases regulate pathways of cell growth and programmed cell 
death; disruptions in these pathways can lead to abnormal cell growth, such as that 
which occurs in cancer. Mutations in serine/threonine protein phosphatase 2A
(PP2A), a multifunctional regulator of cell growth and function, are associated with
the increased growth of tumor cells (Schonthal, 2001). The tumor suppressor
"phosphatase and tensin-homology deleted on chromosome 10" (PTEN) gene encodes
PP3, a lipid phosphatase that dephosphorylates phosphatidylinositol, thus countering
the action of the oncogenes PI3-kinase and Akt, which promote cell survival. PTEN
has been identified as a tumor suppressor; it is deleted in multiple types of advanced 
human cancers. 

[060] Also as noted above, phosphatases regulate pathways that control 
immune function. For example, the CD45 phosphotyrosine phosphatase is one of the 
most abundant glycoproteins expressed on immune cells, and regulates T-cell 
signaling and development (Alexander, 2000). In addition, the serine/threonine
phosphatase calcineurin plays a central role in lymphocyte activation, among other 
important and wide-ranging cellular functions (Baksh and Burakoff, 2000). Certain 
compounds, specifically, cyclosporine and FK-506 (Tacrolimus), have been found to 
inhibit the phosphatase activity of calcineurin, thereby suppressing the production of 
IL-2 and other cytokines. In addition, these compounds have recently been found to
block the JNK and p38 signaling pathways triggered by antigen recognition in T-cells.
Finally, phosphatase inhibitors have proven to be valuable as immune suppressant
drugs, and those in the field believe that modulators of phosphatase activity promise
to be important immunoregulatory compounds (Allison, 2000).

[061] Phosphatase-related sequences can possess or interact with protein
phosphatase 2C (PP2C) domains, which display Mn2+- or Mg2+-dependent protein
serine/threonine phosphatase activity (http://pfam.wustl.edu/cgi-bin/getdesc? 
name=PP2C). Phosphatase-related sequences can also possess or interact with 
protein-tyrosine phosphatase (Y_phosphatase) domains, which catalyze the removal 
of a phosphate group attached to a tyrosine residue (http://pfam.wustl.edu/cgi- 
bin/getdesc?name=Y_phosphatase). Phosphatase-related sequences can also possess 
or interact with protein phosphatase inhibitor 1/DARPP-32 (DARPP-32) domains,
which inhibit protein phosphatases, and play a role in regulating neurotransmitter 
pathways, receptors, and ion channels (http://pfam.wustl.edu/cgi-bin/getdesc? 
name=DARPP-32). 

[062] Like kinases, phosphatases can be used as targets for therapeutic 
intervention, in cell-free or cell-based assays, for example, in screening for drugs,
including small molecule drugs.

Protease-Related Sequences

[063] Proteases, also known as endopeptidases, are enzymes that cleave
polypeptide chains by hydrolyzing peptide bonds at positions within the amino acid
chain. Different proteases recognize different polypeptide sequences. Endopeptidase 
substrate specificities vary from broad to narrow; for example, subtilisins are 
relatively non-specific, and can cleave polypeptide chains with a wide variety of 
amino acid sequences, whereas thrombin is more specific and can only cleave 
polypeptide chains with an arginine residue on the carboxyl side of the susceptible 
peptide bond and glycine on the amino side. Additional examples of protease-related
sequences include collagenases, trypsin, and damage-induced neuronal endopeptidase
(Kiryu-Seo et al., 2000). 

[064] Proteases mediate the continuous remodeling of living tissues. For 
example, the extracellular matrix, a tissue skeleton that mediates communication 
among cells, and influences the structure and function of associated tissues and
organs, is continuously remodeled. A strictly controlled balance is maintained 
between breakdown of the extracellular matrix by proteases and reconstruction of the 
extracellular matrix. This continued matrix remodeling is a dynamic process that 
shapes the structure and function of tissues and organs (Wojtowicz-Praga, 1999). 

[065] Defects in protease function are responsible for a number of
disorders, including cancer and other hyperproliferative disorders. Proteases are
involved in the pathogenesis of such disorders by virtue of their involvement in both
programmed cell death and tumor invasion and metastasis (Los et al., 2003;
Stetler-Stevenson et al., 1993). Detection of the presence or characteristics of proteases
can be used to screen for and diagnose prostate cancer (Karanazanashvili and
Abrahamsson, 2003). Proteases are also involved in the pathogenesis of
inflammatory and arthritic diseases, such as pancreatitis, osteoarthritis, and
rheumatoid arthritis (Pfützer and Whitcomb, 2001; Martel-Pelletier et al., 2001; Lerch
and Gorelick, 2000).

[066] Protease-related sequences possess or interact with a variety of 
different protease domains, including domains belonging to the cysteine protease 
family, the serine protease family, and the metalloproteinase family 
(http://pfam.wustl.edu/cgi-bin/text_search?terms=endopeptidase&search_what=all&sections=DE&sections=CC&size=10).

Phosphodiesterase-Related Sequences

[067] Phosphodiesterases are enzymes that cleave phosphodiester
bonds, i.e., bonds formed by two hydroxyl groups in an ester linkage to the same
phosphate group, such as those between adjacent RNA or DNA nucleotides. 
Phosphodiesterases are found in both soluble and membrane-associated forms. Most 
phosphodiesterases act within a network of signal transduction molecules and other 
signaling effectors, and are modulated by components of these pathways. 
Phosphodiesterases regulate the metabolism and synthesis of cyclic nucleotides in 
signal-transduction pathways. They hydrolyze cAMP and cGMP, molecules that play 
an important and widespread role in signal transduction. Phosphodiesterases also
repair damage to nucleic acids. Some phosphodiesterases are regulated primarily by 
calcium and calmodulin, others are regulated primarily by cGMP. They differ in their 
sensitivity to individual inhibitors, but all share a homologous catalytic region (Siegel
et al., 1999).

[068] Examples of phosphodiesterases include nucleotide
pyrophosphatases (NPP) and plasma membrane glycoprotein PC-1, which are present
in elevated levels in the fibroblasts of patients with Lowe's syndrome (Funakoshi et
al., 1992). Another example of a phosphodiesterase is myomegalin-like protein,
which is expressed at high levels in the nucleus and cytoplasm of heart and skeletal 
muscle (Soejima et al., 2001). Phosphodiesterases have demonstrated promise in 
cancer chemotherapy, analgesia, the treatment of Parkinson's disease, and the 
treatment of learning and memory disorders (Weishaar, et al., 1985). 

[069] Phosphodiesterase-related sequences can possess or interact with
type I phosphodiesterase/nucleotide pyrophosphatase (phosphodiest) domains, which
catalyze the cleavage of phosphodiester and phosphosulfate bonds
(http://www.sanger.ac.uk/cgi-bin/Pfam/getacc?PF01663). Phosphodiesterase-related
sequences can also possess or interact with 3',5'-cyclic nucleotide phosphodiesterase
(PDEase) domains, which are involved in signal transduction
(http://www.sanger.ac.uk/cgi-bin/Pfam/getacc?PF00233).

[070] Phosphodiesterases (PDEs) are also useful as targets for 
therapeutic intervention, for example, for identification of agonists or antagonists, 
such as in the screening of small molecule inhibitors. A well known PDE-5 inhibitor, 
sildenafil citrate (Viagra®) is used for treatment of erectile dysfunction (Brock, 
2000). The mechanism of action involves inhibition of PDE-5 enzyme and resulting 
increase in cyclic guanosine monophosphate (cGMP) and smooth muscle relaxation in 
the penis (Rosen and McKenna, 2002). Such inhibitors may also find use for
treatment of severe pulmonary arterial hypertension (Ghofrani et al., 2003).

Kinesin-Related Sequences

[071] Cells transport proteins and organelles in an orderly and
regulated manner along cytoskeletal filaments. Molecular motor proteins, such as
kinesins, can carry such cargo along the cytoskeletal filaments to specific
destinations, in a highly regulated manner. Exemplary membrane-bound cargoes 
include mitochondria, lysosomes, endoplasmic reticulum, and axonal vesicles (Vale, 
2003). Kinesins also transport nonmembranous cargo, such as mRNAs, tubulin 
monomers, and intermediate filaments (Vale, 2003).

[072] Kinesins, e.g., KIF11, function in the cell division process (Miki
et al., 2001). In the nucleus, kinesins are necessary to establish spindle bipolarity,
position chromosomes on metaphase plates, and maintain forces in the spindle.
Several members of the kinesin family are associated with the chromosomes, and are
likely to perform a role in mitotic chromosome movement (Miki et al., 2001). For
example, the C-terminal kinesin KIFC1 is involved in the processes of meiosis,
mitosis, and karyogamy (Miki et al., 2001). The kinesin GAKIN binds to the human
analog of the Drosophila Discs Large tumor suppressor protein (hDlg), a membrane 
associated guanylate kinase (Hanada, 2000). GAKIN undergoes translocation in T- 
lymphocytes upon their cellular activation (Hanada, 2000). The GAKIN/hDlg 
complex is also hypothesized to play a role in cell division (Hanada, 2000). Thus, the 
kinesin GAKIN plays a role in cell proliferation and T-cell mediated immune 
function. 

[073] Kinesin-mediated intracellular transport is also implicated as a
mechanism of tumorigenesis. For example, kinesin transports the tumor suppressor
adenomatous polyposis coli protein (APC) (Jimbo et al., 2002). The APC gene is
mutated in both sporadic and familial colorectal tumors. The APC protein interacts 
with the microtubule plus-end-directed kinesin proteins KIF3A and KIF3B through an 
association with the kinesin superfamily-associated protein 3 (KAP3). Normally, the 
APC tumor suppressor is transported to its correct intracellular location at the tips of 
membrane protrusions. Mutant APCs derived from cancer cells, however, are unable 
to undergo kinesin-mediated transport, and do not accumulate with normal efficiency 
in clusters in the membrane protrusions, and thereby cannot function efficiently as
tumor suppressors. 

[074] In view of the connection to cancer, investigators have sought
small molecules to inhibit specific molecular motors in cells, such as the mitotic
kinesin Eg5/KSP (Mayer, 1999). In addition, others have found that small molecule
inhibitors of Eg5/KSP with low nanomolar affinity have anti-tumor activity, and one
such agent has entered phase I clinical trials (Vale, 2003).

[075] In another arena, it has been proposed that impairing motor- 
driven delivery of MHC peptide complexes to the surface of dendritic cells could 
provide immunomodulation. Additionally, inhibiting the cell surface delivery of
cytotoxic granules in T cells could help provide immunosuppressive therapy (Vale, 
2003). 

[076] Kinesin-related sequences can possess or interact with kinesin
motor (kinesin) domains, which hydrolyze ATP and bind to microtubules to produce a
motor-active force that transports intracellular vesicles and organelles
(http://pfam.wustl.edu/cgi-bin/getdesc?name=kinesin). Kinesin-related sequences can
also possess or interact with kinesin-associated protein (KAP) domains, which are
non-motive domains that form a complex with kinesin
(http://pfam.wustl.edu/cgi-bin/getdesc?name=KAP). Kinesin-related sequences can
also possess or interact with
MyTH4 domains, which are present in the tail of the motor ATPase proteins kinesin
and myosin (http://pfam.wustl.edu/cgi-bin/getdesc?name=MyTH4).

[077] Kinesins, like kinases, are useful as targets for therapeutic
intervention, for example, in screening for small molecule inhibitors for the treatment
of cancer. 

Immunoglobulin-Related Sequences 

[078] An immunoglobulin is an antibody molecule, and is typically 
composed of heavy and light chains, each of which has constant regions that display
similarity with other immunoglobulin molecules and variable regions that convey
specificity to particular antigens. Most immunoglobulins can be assigned to classes,
e.g., IgG, IgM, IgA, IgE, and IgD, based on antigenic determinants in the heavy chain
constant region; each class plays a different role in the immune response.

[079] Immunoglobulins are characterized by a structural motif, the 
immunoglobulin (Ig) domain, which is approximately one hundred amino acids long,
is involved in protein-protein and protein-ligand interactions, and includes a
conserved intradomain disulfide bond (http://pfam.wustl.edu/cgi-bin/getdesc?
name=ig). It is one of the most common domains found among all known proteins,
and is present in hundreds of proteins with diverse functions. Proteins with the ig
domain comprise the immunoglobulin superfamily; members include antibodies,
T-cell receptors, major histocompatibility proteins, the CD4, CD8, and CD28
co-receptors, most of the invariant polypeptide chains associated with B and T cell
receptors, leukocyte Fc receptors, the giant muscle kinase titin, and receptor tyrosine
kinases (Janeway et al., 2001; Alberts et al., 1994).

[080] Polypeptides with immunoglobulin-like domains can be markers for
specific types of tissues and tumors. For example, a 43-kDa protein membrane
antigen with two immunoglobulin-like domains in its extracellular region is expressed
in normal human colonic and small bowel epithelium and >95% of human colon
cancers, but absent from most other human tissues and tumor types (Heath et al., 
1997). 

[081] Polypeptides with immunoglobulin-like domains are also involved in 
inflammation. For example, myelin oligodendrocyte glycoprotein, a myelin-specific
protein found in the central nervous system, specifically binds to and activates
complement, an effector of the immune system, via its extracellular
immunoglobulin-like domain. By virtue of providing the means for an interaction between myelin and
the complement component of the immune response, myelin oligodendrocyte 
glycoprotein is a modulator of central nervous system inflammation and has been 
predicted by those in the field to be relevant to the pathogenesis of demyelinating 
diseases such as multiple sclerosis (Johns and Barnard, 1997).

[082] Immunoglobulin-related sequences can also possess or interact with 
leucine-rich repeat domains, which are involved in protein-protein interactions, and 
are used in molecular recognition processes as diverse as signal transduction, cell 
adhesion, cell development, DNA repair and RNA processing 
(http://pfam.wustl.edu/cgi-bin/getdesc?name=LRRNT). Immunoglobulin-related
sequences can also possess or interact with fibronectin type III repeat (fn3) domains
(http://pfam.wustl.edu/cgi-bin/getdesc?name=fn3), which contain binding sites for
DNA and heparin. Immunoglobulin-related sequences can also possess or interact
with WASp Homology domain 1 (WH1), which can bind the metabotropic glutamate
receptors mGluR1 alpha and mGluR5 (http://pfam.wustl.edu/cgi-bin/getdesc?
name=WH1).

Glycosylphosphatidylinositol Anchor-Related Sequences 

[083] Glycosylphosphatidylinositol (GPI) anchor proteins are
synthesized as single membrane proteins; the transmembrane segment is cleaved
away in the endoplasmic reticulum, where a GPI membrane anchor is added. The 
resulting protein is bound to the non-cytoplasmic, i.e., either extracellular or luminal, 
side of the membrane by the GPI anchor. GPI anchor proteins can be dissociated 
from the membrane by phosphatidylinositol-inositol-specific phospholipase C 
(Alberts et al., 1994). Examples of GPI-anchor proteins include prefoldin, a 
chaperone that delivers unfolded proteins to cytosolic chaperonin (Vainberg et al., 
1998), and carboxypeptidase M, which is associated with the differentiation of 
monocytes to macrophages (Rehli et al., 1995). 

[084] GPI anchor protein-related sequences can possess or interact with
KE2 domains, which may contain a DNA binding leucine zipper motif
(http://www.sanger.ac.uk/cgi-bin/Pfam/getacc?PF01920). GPI anchor protein-related sequences
can also possess or interact with zinc carboxypeptidase (Zn_carbOpept) domains,
which include carboxypeptidase H regulatory domains and carboxypeptidase A
digestive domains (http://www.sanger.ac.uk/cgi-bin/Pfam/getacc?PF00246).

Other Polypeptide-Related Sequences
Activator-Related Sequences 

[085] An activator is a molecule or collection of molecules that
positively modulates the activity of a regulatory protein, or that binds to DNA and
regulates one or more genes by increasing the rate of transcription. Regulatory 
protein activators contribute to an increase in protein activity. Transcriptional 
activators provide a positive control over gene transcription; for example, they can 
sense the internal condition of the cell and bind to a sequence of DNA near a target 
promoter, resulting in the transcription of an appropriate gene. Examples of
activator-related sequences include template-activating factors, bacterial catabolite activators,
and the coenzyme thiamine pyrophosphatase. Activator-related sequences, e.g., 
factors that influence viral replication and transcription, can be encoded by oncogenes 
(Nagata et al., 1995). 

[086] Activator-related sequences can possess or interact with SH2
domains, which are protein domains of about 100 amino acid residues found in many
signal-transducing proteins. SH2 domains can regulate signaling cascades, e.g., by
interacting with phosphotyrosine-containing target peptides in a sequence-specific and
phosphorylation-dependent manner (http://pfam.wustl.edu/cgi-bin/getdesc?
name=SH2). Activator-related sequences also possess or interact with nucleosome
assembly protein (NAP) domains, which regulate gene expression, and are accessible
to histones (http://pfam.wustl.edu/cgi-bin/getdesc?name=NAP).

Adaptor-Related Sequences

[087] Adaptors are proteins involved in the process of capturing
specific cargo molecules into membrane-bound vesicles for transport through the cell.
Different adaptors recognize different receptors for cargo molecules, and also 
recognize different vesicle coat proteins, accounting, in part, for the specificity of the 
content of intracellular vesicles bound to specific destinations within the cell (Kirsch 
et al., 1999). Examples of adaptor-related sequences include adaptins, clathrins, 
adaptor-related protein complex subunits, and Cas ligand with multiple Src homology
3 domains (CMS) adaptors. 

[088] Adaptor-related sequences can possess or interact with src
homology 3 (SH3) domains, which are small protein modules of approximately 50
amino acid residues found in a variety of intracellular or membrane-associated
proteins. SH3 domains are often indicative of a protein involved in signal
transduction events related to cytoskeletal organization
(http://pfam.wustl.edu/cgi-bin/getdesc?name=SH3). Adaptor-related sequences also
possess or interact with the
adaptin N-terminal (Adaptin_N) protein domain, which is found in the N-terminal
region of various adaptor protein complexes. The N-terminal region of adaptor
proteins is relatively constant in comparison to the C-terminal region
(http://pfam.wustl.edu/cgi-bin/getdesc?name=Adaptin_N).

Adhesion Molecule-Related Sequences 

[089] Adhesion molecules are molecules that mediate the adhesion of
cells with other cells, and with the extracellular matrix. Examples of adhesion
molecules include members of the immunoglobulin superfamily, integrins, cadherins,
selectins, and transmembrane proteoglycans. The adhesion molecule 
carcinoembryonic antigen (CEA) is present nearly exclusively on cancer cells, and is 
expressed on the cell surface of approximately 80% of all solid cancerous tumors 
(Berinstein et al., 2002). 

[090] Adhesion molecule-related sequences can possess or interact with
the immunoglobulin (ig) domain, which is described above. Adhesion
molecule-related sequences can also possess or interact with integrin alpha cytoplasmic region
(integrin_A) domains, which comprise the short, intracellular region of the integrin
alpha chain (http://pfam.wustl.edu/cgi-bin/getdesc?name=integrin_A).

Antigen-Related Sequences 

[091] An antigen is a molecule that provokes an immune response; antigens
include both foreign antigens and autoantigens. Antigens can be expressed in a
tissue-specific manner and their expression can be developmentally regulated. For
example, the heat stable antigen HSA is expressed in both a tissue-specific manner,
i.e., it is restricted to hematopoietic cells, and a developmentally-regulated manner,
i.e., it is more highly expressed in immature precursor cells than in terminally
differentiated cells (Wenger et al., 1993). Antigens can be expressed on the cell 
surface or inside the cell, e.g., in the nucleus or on intermediate filaments. Antigen- 
related sequences include sequences related to tumor antigens, which are expressed 
exclusively in tumor cells, or in greater amounts in tumor cells than in normal cells. 
Tumor antigens can be transmembrane proteins, with one or more transmembrane
domains (Li et al., 1996; Linnenbach, et al., 1993). 

[092] Autoantigens, which are components of the body that provoke an 
immune response, are involved in the pathogenesis of autoimmune disease.
Autoantigens can be either selectively or ubiquitously expressed among cell and
tissue types. They can be localized to any region of the cell, including the nucleus,
nucleolus, nuclear envelope, and intermediate filaments (Racevskis et al., 1996). For
example, pancreatic islet cell antigens are involved in the autoimmune pathogenesis 
of diabetes, and thyroid antigens are involved in autoimmune thyroid disease. 

[093] Antigen-related sequences can possess or interact with the ICAp69 
domain, which is characterized by a 69 kDa pancreatic islet cell autoantigen present in 
autoimmune (insulin-dependent) diabetes mellitus (http://pfam.wustl.edu/cgi- 
bin/getdesc?name=ICA69). Antigen-related sequences can also possess or interact 
with the Ku70/Ku80 C-terminal arm (Ku_C) or Ku70/Ku80 N-terminal alpha/beta
(Ku_N) domains, which belong to the Ku family of peptides
(http://pfam.wustl.edu/cgi-bin/getdesc?name=Ku_C;
http://pfam.wustl.edu/cgi-bin/getdesc?name=Ku_N). Ku, an antigen associated with
autoimmune disease, normally
functions to bind DNA double-strand breaks and facilitate DNA repair, but induces
autoimmunity under pathological conditions. Antigen-related sequences can also
possess or interact with the bZIP transcription factor (bZIP) domain, which comprises
a basic region and a leucine zipper region (http://pfam.wustl.edu/cgi-bin/getdesc?
name=bZIP). Antigen-related sequences can possess or interact with YT521-B-like
(YTH) domains, which comprise YT521-B, a tyrosine-phosphorylated nuclear protein
domain that modulates alternative RNA splice site selection, and interacts with other
nuclear proteins, e.g., scaffold attachment factor B, and Sam68, a 68-kDa substrate
associated with Src during mitosis
(http://pfam.wustl.edu/cgi-bin/getdesc?name=YTH).

ATPase-Related Sequences 

[094] ATPases are enzymes that use the energy of ATP hydrolysis to
move ions or small molecules across a membrane against a chemical concentration
gradient or electrical potential. For example, ATPases can maintain low intracellular
calcium and sodium ion concentrations, and generate a low pH inside lysosomes,
plant-cell vacuoles, and the lumen of the stomach. Vacuolar ATPases are
ATP-dependent proton pumps that create pH gradients by transporting protons across
membranes, while coupling the energy produced in the conversion of ATP to ADP
with proton transport (Forgac, 1999). They can acidify or alkalinize cells, organelles,
and extracellular compartments, and create voltage gradients that drive the secretion
or absorption of ions and fluids (Wieczorek et al., 1999). Examples of ATPase-related
sequences include proton transporters, glucose transporters, multidrug resistance
factors, calcium ATPases, and porins.

[095] ATPase-related sequences can possess or interact with ATP
synthase F/14-kDa subunit (ATP-synt_F) domains, which correspond to a 14-kDa
subunit in the peripheral catalytic part of vacuolar ATPases
(http://pfam.wustl.edu/cgi-bin/getdesc?name=ATP-synt_F). ATPase-related sequences
can also possess or interact with vacuolar (H+)-ATPase C, D, G, and H subunit
(V-ATPase) domains, which are membrane-attached sequences that generate an acidic
environment (http://pfam.wustl.edu/cgi-bin/getdesc?name=V-ATPase_G).

ATP-Related Sequences 

[096] Adenosine triphosphate (ATP) is a nucleotide comprising an
adenine, a ribose, and a triphosphate unit. The triphosphate unit contains two
phosphoanhydride bonds that confer an energy-rich property to ATP. The free energy
liberated in the hydrolysis of one or both of these bonds can drive reactions that
require an input of free energy. A wide range of physiological and pathological
processes are driven by the energy of ATP, including cellular movement, the
synthesis of biomolecules from precursors, muscle contraction, ciliary and flagellar
function, intermediary metabolism, glycolysis, fatty acid oxidation, oxidative
phosphorylation, and membrane transport (Ku et al., 1990). Examples of ATP-related
sequences include ATPases, ATP synthases, ATP carrier proteins, and myosin.

[097] ATP-related sequences can possess or interact with ATP-synthase
subunit C protein domains (ATP-synt_C), which are protein domains that
consist of two long terminal hydrophobic regions, and are implicated in the
proton-conducting activity of ATPases
(http://pfam.wustl.edu/cgi-bin/getdesc?name=ATP-synt_C). ATP-related sequences can
also possess or interact with mitochondrial carrier protein (mito_carr) domains, which
are involved in energy transfer across the inner mitochondrial membrane
(http://pfam.wustl.edu/cgi-bin/getdesc?name=mito_carr).

Binding Protein-Related Sequences 

[098] A binding protein is a protein that binds to another molecule with
specificity. Binding proteins can be involved in building macromolecular structures,
e.g., in cytoskeletal assembly or scaffolding (Machesky et al., 1997). Proteins often
exist in the cell in complexes with other proteins, nucleic acids, lipids, and/or small
molecules. For example, steroid receptors, e.g., the progestin, estrogen, androgen,
and glucocorticoid receptors, bind to heat-shock proteins and FKBP52, a
calcium-regulated immunosuppressant, to form functional complexes (Peattie et al.,
1992; Sanchez et al., 1990). DNA binding proteins and general transcription factors
bind to the TATA box, a consensus sequence in a gene's promoter region that specifies
the position of transcription initiation, forming a functional transcription complex
(Chalut et al., 1995). Proteins can interact with multiple molecules simultaneously.
For example, Nedd4, a ubiquitin-protein ligase, can interact with multiple proteins
and lipids through its lipid binding domain and multiple protein binding domains
(Jolliffe et al., 2000).

[099] Proteins utilize a large number of motifs to bind other molecules. 
Binding protein-related sequences can possess or interact with the cold-shock DNA- 
binding (CSD) domain, a conserved domain of about 70 amino acids that helps the 
cell survive in temperatures below optimum growth temperature by inducing the
synthesis of proteins that negatively regulate transcription, translation, and
recombination, resulting in suppressed cell proliferation
(http://pfam.wustl.edu/cgi-bin/getdesc?name=CSD). Proteins induced by exposure to
cold include DNA-binding proteins, and cold inducible RNA binding proteins, which
have RNA binding domains at or near their N-termini (Nishiyama et al., 1997). For
example, contrin, a testis-specific DNA/RNA binding protein with a cold shock domain,
also has a large number of phosphorylation sites, each of which can mediate
intermolecular
interactions (Tekur et al., 1999). Contrin is involved in transcription of testis-specific 
genes; its inactivation could provide a reversible male contraceptive. 

[0100] Binding protein-related sequences can possess or interact with the
ARID/BRIGHT DNA binding (ARID) domain, which is an approximately 100 amino
acid sequence involved in a wide range of DNA interactions, including, but not
limited to, interaction with AT-rich regions
(http://pfam.wustl.edu/cgi-bin/getdesc?name=ARID). ARID-encoding genes are involved
in a variety of biological
processes, including regulation of cell growth, development, cell lineage gene 
regulation, cell cycle control, and tissue-specific gene expression. 

[0101] Binding protein-related sequences can also possess or interact with 
nucleosomal binding domains to facilitate binding within the nucleosome, a nuclear 
structure comprised of chromosomal DNA and proteins. For example, the HMG14 
and HMG17 (HMG14_17) domain is present in some nucleosome proteins, most 
commonly, in proteins HMG14 and HMG17, members of a family designated as high
mobility group proteins, which form components of chromatin, and bind to
nucleosomal DNA, regulating the interaction of the DNA with histone proteins
(http://pfam.wustl.edu/cgi-bin/getdesc?name=HMG14_17).

[0102] Binding protein-related sequences can also possess or interact with
conserved motifs that recognize RNA, and allow the protein to bind RNA
(http://pfam.wustl.edu/cgi-bin/textsearch?terms=rna+binding&search_what=all&sections=DE&sections=CC&size=100).
These motifs include the RNA recognition (rrm) domain, also known as an RRM, RBD, or
RNP domain (http://pfam.wustl.edu/cgi-bin/getdesc?name=rrm). Numerous RNA binding
proteins possess the rrm domain, including heterogeneous nuclear ribonucleoprotein
(hnRNP) proteins, which are implicated in the regulation of alternative splicing, and
La proteins, which are among the main autoantigens in systemic lupus erythematosus
(SLE).

[0103] Binding protein-related sequences can also possess or interact with
conserved motifs that mediate their binding to ions, e.g., calcium. Calcium-binding
proteins such as calmodulin, the calcineurins, and their homologues and related
proteins are widely used to regulate cellular processes
(http://pfam.wustl.edu/cgi-bin/textsearch?terms=calcium+binding&search_what=all&sections=DE&sections=CC&size=100).
Ion-binding proteins include phosphoproteins that bind to other
molecules in a manner dependent on their phosphorylation state, and can regulate
many types of molecules and processes, including those that utilize complex signaling
cascades (Pang et al., 2001; Pang et al., 2002; Lin et al., 1999). Ion-binding
protein-related sequences can possess or interact with the EF hand (efhand) domain, a
calcium-binding domain that comprises a loop of twelve amino acids that coordinates
a calcium ion in a pentagonal bipyramidal configuration and is flanked on both sides
by a twelve amino acid alpha-helical domain
(http://pfam.wustl.edu/cgi-bin/getdesc?name=efhand).

Breakpoint-Related Sequences 

[0104] A breakpoint is the location on a chromosome where a gene is
disrupted, and one segment of the gene is severed from the other. Chromosomal
breaks that disrupt coding or regulatory sequences can result in gene mutation.
Chromosomal breaks can also serve as molecular landmarks, e.g., a break can be
detected on Southern blots as the loss of an expected band and the appearance of two
novel bands. Examples of breakpoint-related sequences include the sequences that
generate the Philadelphia chromosome translocation, the sequences that generate the
chromosome translocation t(1;7)(q42;p15), which is implicated in Wilms' tumor,
and the sequences that generate the chromosomal translocation t(18;21)(q22.1;q21.3),
which is implicated in Down syndrome.
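
The Southern blot readout mentioned above can be illustrated with a minimal sketch. It assumes a probe that spans the breakpoint and a single restriction fragment of known length; the function name, parameter names, and fragment sizes are hypothetical and chosen only for illustration.

```python
def southern_bands(fragment_len, breakpoint, partner_ext_a, partner_ext_b):
    """Predict band sizes (in bp) seen by a probe that spans the breakpoint.

    fragment_len:  length of the normal restriction fragment (hypothetical).
    breakpoint:    position of the break inside that fragment.
    partner_ext_a: partner-chromosome DNA between the junction and the next
                   restriction site on the derivative carrying the left piece.
    partner_ext_b: the same distance on the derivative carrying the right piece.
    """
    if not 0 < breakpoint < fragment_len:
        raise ValueError("breakpoint must lie inside the fragment")
    # The intact chromosome yields the single expected band.
    normal = [fragment_len]
    # Each derivative chromosome keeps one piece of the original fragment,
    # joined to partner DNA up to the next restriction site, so the expected
    # band is lost and two novel bands appear.
    rearranged = sorted([breakpoint + partner_ext_a,
                         (fragment_len - breakpoint) + partner_ext_b])
    return normal, rearranged


normal, rearranged = southern_bands(10_000, 4_000, 1_500, 2_500)
print("normal allele:", normal)
print("rearranged allele:", rearranged)
```

In a heterozygous sample, all three bands would be expected together, since one homolog remains intact.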

[0105] Breakpoints commonly occur in discrete regions of the 
chromosome. Breakage at these regions can lead to a recognized disease phenotype. 
One way of generating such a phenotype is by chromosomal translocation, i.e.,
chromosomes mutate by exchanging parts. When a segment from one chromosome is
exchanged with a segment from another nonhomologous chromosome, two mutated
chromosomes are simultaneously generated (Griffiths et al., 1999). The Philadelphia
chromosome, a mutation sometimes associated with chronic myelogenous leukemia 
(CML), is an example. It results from the translocation of a discrete segment of 
chromosome 22 into a discrete region of chromosome 9. Patients with the 
Philadelphia chromosome mutation generally have a better prognosis than CML 
patients with other characteristics. 
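
The exchange of segments between two nonhomologous chromosomes can be sketched as follows. This is a simplified model under stated assumptions: each chromosome is a list of labeled segments, each chromosome has a single break, and the segment labels and break positions are hypothetical.

```python
def reciprocal_translocation(chrom_a, chrom_b, break_a, break_b):
    """Swap the segments distal to one break on each of two chromosomes.

    chrom_a, chrom_b: lists of segment labels (hypothetical model).
    break_a, break_b: index at which each chromosome is broken.
    Returns the two derivative chromosomes generated simultaneously.
    """
    der_a = chrom_a[:break_a] + chrom_b[break_b:]
    der_b = chrom_b[:break_b] + chrom_a[break_a:]
    return der_a, der_b


chrom_a = ["A1", "A2", "A3", "A4"]
chrom_b = ["B1", "B2", "B3"]
der_a, der_b = reciprocal_translocation(chrom_a, chrom_b, 2, 1)
print(der_a)
print(der_b)
```

No material is gained or lost in this model; the two derivatives together contain every original segment exactly once.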

[0106] Acquired clonal chromosomal abnormalities are found in the 
malignant cells of most patients with leukemia, lymphoma, and solid tumors. Some 
of these abnormalities are the result of consistent chromosomal rearrangements. For 
example, in a preponderant number of chronic myelogenous leukemia cases,
breakpoints at chromosome band 22q11 occur within a breakpoint cluster region of
5-6 kb (Weinstein et al., 1988).

[0107] Chromosome rearrangements affecting band 3q21 are associated
with a particularly poor prognosis in myeloid leukemia or myelodysplasia. These
breakpoints cluster in a breakpoint cluster region of approximately 30 kb, located
centromeric and downstream of the ribophorin I (RPN-1) gene (Weiser, 2002). The
apoptotic gene bcl-2 was isolated as a breakpoint rearrangement in human follicular
lymphomas and was shown to act as an oncogene that promoted cell survival rather
than cell proliferation.

[0108] Some proteins can act as leukemia or lymphoma-specific 
antigens for major histocompatibility complex-restricted T cell cytotoxicity. These 
include the breakpoint cluster region (bcr)-abl, and other fusion oncoproteins. 
Genetically engineered chimeric and humanized antibodies have demonstrated 
activity against overt lymphomas and leukemias. Radioimmunotherapy has produced 
significant therapeutic responses with minimal radiation exposure to normal tissues 
(Jurcic et al., 2000).

[0109] Breakpoint-related sequences can possess or interact with
RhoGAP domains, also known as breakpoint cluster region-homology domains, which
mediate signal transduction by small G proteins
(http://pfam.wustl.edu/cgi-bin/getdesc?name=RhoGAP). Breakpoint-related sequences
can also possess or interact with RhoGEF domains, which comprise approximately 200
amino acid residues that encode a guanine nucleotide exchange factor
(http://pfam.wustl.edu/cgi-bin/getdesc?name=RhoGEF). Breakpoint-related sequences
can also possess or interact with Plectin/S10 (S10_plectin) domains, which are found
at the N-terminus of some isoforms of plectin and ribosomal S10 protein
(http://pfam.wustl.edu/cgi-bin/getdesc?name=S10_plectin).

Carrier or Transport-Related Sequences 

[0110] A membrane transport protein is an integral transmembrane protein
that aids the passage of one or more molecules across a cell membrane. Most, if not
all, types of molecules are transported across membranes, including proteins, ions,
and fatty acids (Schaffer and Lodish, 1994). Even the passage of molecules such as
water and urea, which can diffuse across pure phospholipid bilayers, is frequently
accelerated by transport proteins. Transporters clear cells of toxins, and confer drug
resistance on tumor lines (Ramalho-Santos et al., 2002). The rate of transport varies
considerably among membrane transport proteins. Membrane transport proteins function
in the plasma
membrane and in intracellular organellar membranes, including the nuclear, 
mitochondrial, lysosomal, and vesicular membranes. For example, transportin, also 
known as karyopherin beta2, imports nuclear mRNA binding proteins from the 
cytoplasm across the nuclear membrane, into the nucleus (Bonifaci et al., 1997). 

[0111] Membrane transport proteins can have either a broad or a narrow 
range of specificity for the transported substance. In mammalian cells, nucleoside 
transport across membranes is mediated by broad specificity transporters. Nucleoside 
transport plays a role in such diverse cellular functions as nucleotide synthesis,
neurotransmission, and platelet aggregation. Nucleoside transporters carry 
chemotherapeutic nucleosides, and are a target of interest in chemotherapeutic and 
cardiac drug design (Griffiths et al., 1997; Ku et al., 1990). 

[0112] Carriers are another class of membrane transport proteins; they bind 
to a solute and transport it across the membrane by undergoing a series of 
conformational changes. In contrast to channel proteins, transporters bind only one, 
or a few, substrate molecules at a time; after binding substrate molecules, they
undergo a conformational change such that the bound substrate molecules, and only
those molecules, are transported across the membrane. Carriers transport a wide
variety of molecules, including fatty acids across the plasma membrane (Schaffer and
Lodish, 1994); purines, pyrimidines, and components of nucleosides across the
nuclear membrane; and adenine nucleotides across the inner mitochondrial membrane
(Battini et al., 1997).

[0113] Membrane transport-related sequences can possess or interact with
vacuolar (H+)-ATPase C, D, G, and H subunit (V-ATPase) domains, which are
membrane-attached sequences that generate an acidic environment
(http://pfam.wustl.edu/cgi-bin/getdesc?name=V-ATPase_C). Membrane transport-related
sequences can also possess or interact with nucleoside transporter
(Nucleoside_tran) domains, which are found in proteins that transport nucleosides
across the plasma membrane, and are employed to synthesize nucleotides via the
salvage pathways in cells that lack their own de novo synthesis pathways
(http://pfam.wustl.edu/cgi-bin/getdesc?name=Nucleoside_tran). Membrane
transport-related sequences can also possess or interact with ATP synthase F/14-kDa
subunit (ATP-synt_F) domains, which correspond to a 14-kDa subunit in the peripheral
catalytic part of vacuolar ATPases
(http://pfam.wustl.edu/cgi-bin/getdesc?name=ATP-synt_F). Membrane transport-related
sequences can also possess or interact with mitochondrial carrier protein (mito_carr)
domains, which are involved in energy transfer across the inner mitochondrial membrane
(http://pfam.wustl.edu/cgi-bin/getdesc?name=mito_carr). Membrane transport-related
sequences can also possess or interact with an AMP-binding enzyme (AMP-binding)
domain, which is a domain rich in serine, threonine, and glycine, and is characterized
by a conserved proline-lysine-glycine triplet sequence
(http://pfam.wustl.edu/cgi-bin/getdesc?name=AMP-binding).

[0114] Membrane transport proteins, such as those expressed in cancer cells, 
are useful as targets for therapeutic intervention, for example, in the screening for 
small molecule inhibitors. Inhibition of membrane transport, as indicated above, may 
make cancer cells more susceptible to chemotherapy, for example. 

Channel-Related Sequences 

[0115] Channel proteins transport water or specific types of ions down
their concentration or electrical potential gradients. They form a protein-lined
passageway across the membrane through which multiple water molecules or ions
move at a very rapid rate, e.g., up to 10^ per second. The plasma membrane, for
example, contains potassium-specific channel proteins that generate the cell's resting
electric potential across the plasma membrane. Examples of channel-related
sequences include the sodium hydrogen exchanger, sodium potassium ATPase, and
the cystic fibrosis transmembrane regulator.

[0116] Members of this subset of membrane transport proteins have
wide-ranging functions in both normal physiology and in pathology. For example, the
transport system that mediates the transmembrane exchange of sodium for hydrogen
across the plasma membrane plays a physiological role in the regulation of
intracellular pH, the control of cell growth and proliferation, stimulus-response
coupling, metabolic responses to hormones, the regulation of cell volume, and the
transepithelial absorption and secretion of several ions. The sodium-hydrogen
exchanger also plays a role in cancer and in tissue and organ hypertrophy 
(Mahnensmith and Aronson, 1985). 

[0117] Channel-related sequences can possess or interact with
sodium/hydrogen exchanger (Na_H_Exchanger) domains, which exchange sodium
for hydrogen across a membrane in an electroneutral manner
(http://pfam.wustl.edu/cgi-bin/getdesc?name=Na_H_Exchanger). Channel-related
sequences can also possess or interact with neurotransmitter-gated ion-channel ligand
binding (Neur_chan_LBD) domains, which form the extracellular domains of some ion
channels (http://pfam.wustl.edu/cgi-bin/getdesc?name=Neur_chan_LBD).
Channel-related sequences can also possess or interact with UBX domains, which are
present in ubiquitin-regulatory proteins
(http://pfam.wustl.edu/cgi-bin/getdesc?name=UBX).

Checkpoint-Related Sequences

[0118] The cell division cycle is the fundamental means by which living 
things are propagated. Fundamental to successful propagation is the faithful 
replication of DNA; a cell cycle control system exists to coordinate the cycle as a
whole. The control system is regulated by brakes that can stop the cycle at specific 
checkpoints. Thus, the checkpoints arrest the cycle upon the occurrence of 
undesirable events, such as DNA damage, replication stress, or mitotic spindle 
disruption. For example, DNA lesions and disrupted replication forks are recognized 
by the DNA damage checkpoint and replication checkpoint, respectively. 
Checkpoints can also, for example, initiate protein kinase-based signal transduction 
cascades to activate downstream effectors that elicit cell cycle arrest, DNA repair, or
apoptosis. These actions prevent the conversion of aberrant DNA structures into
inheritable mutations and minimize the survival of cells with unrepairable damage 
(Qin and Li, 2003). 

[0119] Dysregulation of the cell-cycle is a hallmark of tumor cells. 
Defective checkpoint function results in genetic modifications that contribute to 
tumorigenesis. Checkpoint function can be abrogated by many different mechanisms
(Bast et al., 2000). For example, cyclin-dependent kinases that normally are
activated at a checkpoint can be inactivated or activated in an abnormal manner.
Alternatively, the normal activities of the cyclin-dependent kinase inhibitors,
phosphatases, or other regulatory molecules of the cell cycle can be altered. Tumor
suppressors are among the classes of molecules that can effect cell cycle
dysregulation. The abrogation of checkpoint function can alter the sensitivity of
tumor cells to chemotherapeutics (Stewart et al., 2003).

[0120] Checkpoint-related sequences can possess or interact with
phosphoribosylaminoimidazole-succinocarboxamide synthase (SAICAR_synt)
domains, which function in de novo purine synthesis
(http://pfam.wustl.edu/cgi-bin/getdesc?name=SAICAR_synt). Checkpoint-related
sequences can also possess or interact with WD40 domains, which comprise a domain of
approximately 40 amino acids, which are sometimes present in tandem repeats
(http://pfam.wustl.edu/cgi-bin/getdesc?name=WD40). Checkpoint-related sequences can
also possess or interact with cyclin, C-terminal (cyclin_C) domains, which regulate
cyclin dependent kinases (http://pfam.wustl.edu/cgi-bin/getdesc?name=cyclin_C).

[0121] Thus, checkpoint related proteins, e.g., kinases, phosphatases, 
etc., are useful as targets for therapeutic intervention, such as in screening for small 
molecule drugs for the treatment of cancer, immune disorders, and inflammation. 

Complex-Related Sequences 

[0122] Complexes are molecular entities comprised of two or more 
components. Molecular complexes within cells form functional units that carry out 
cellular operations. For example, complexes at the cell membrane perform structural 
and regulatory tasks, including regulating membrane traffic and maintaining organelle 
integrity. Complexes at the cytoskeleton perform static and dynamic roles with 
respect to cell shape, intracellular transport, and communication with the extracellular 
matrix. Complexes in the nucleus transcribe and regulate genes, and complexes at 
sites of protein synthesis translate and regulate proteins. Complexes can reside
intracellularly and/or extracellularly, e.g., in the extracellular matrix. Examples of
complex-related sequences include cytoskeletal and filamentous proteins, ADP- 
ribosylation factor (ARF) proteins, and protein synthesis initiation factors (Amor et 
al., 1994). 

[0123] Complex-related sequences can possess or interact with
ADP-ribosylation factor family (arf) domains, which are GTP-binding domains involved
in protein trafficking (http://pfam.wustl.edu/cgi-bin/getdesc?name=arf).
Complex-related sequences can also possess or interact with eukaryotic initiation
factor domains, e.g., the eukaryotic initiation factor 4E (IF4E) domain, which
recognizes and binds mRNA during protein synthesis
(http://pfam.wustl.edu/cgi-bin/getdesc?name=IF4E). Complex-related sequences can also
possess or interact with intermediate filament (filament) protein domains, which form
filamentous structures typically 8 to 14 nm wide, and form components of the
cytoskeleton and nuclear envelope, e.g., neurofilaments, cytokeratins, lamins,
vimentin, and desmin (http://pfam.wustl.edu/cgi-bin/getdesc?name=filament).

Cytokine-Related Sequences 

[0124] A cytokine is an extracellular signaling protein or peptide that acts as 
a local mediator in communication among cells. Cytokines regulate proliferation and
differentiation; for example, they mediate differentiation of cells in the hematopoietic
lineage. Examples of cytokines include interleukins, interferons, and colony
stimulating factors of the hematopoietic system. Some cytokines, e.g., interferons and
interleukins, can be induced by viral activity, and possess antiviral activity (Sheppard 
et al., 2003). Cytokine-related sequences may enable the expression of a cytokine, for 
example, as a cytokine transcription factor (Bvao et al., 1994). They can also be part 
of a cytokine effector pathway, for example, as an intracellular effector of cytokine- 
related cytoskeletal changes in response to events in the extracellular matrix (Hirsh et 
al., 2001; Joberty et al., 1999). 

[0125] Cytokine-related sequences can possess or interact with
interferon-induced transmembrane protein (CD225) domains, which are associated with
interferon-induced cell growth suppression
(http://pfam.wustl.edu/cgi-bin/getdesc?name=CD225). Cytokine-related sequences can
also possess or interact with SelR (SelR) domains, which bind both selenium and zinc,
and/or methionine sulfoxide reductase enzymatic domains
(http://pfam.wustl.edu/cgi-bin/getdesc?name=SelR). Cytokine-related sequences can
also possess or interact with reverse transcriptase (rvt) domains, which are involved
in RNA-directed DNA polymerase activity, an enzymatic activity that uses an RNA
template to produce DNA for integration into a host genome
(http://pfam.wustl.edu/cgi-bin/getdesc?name=rvt). Cytokine-related sequences can also
possess or interact with L1 transposable element domains (Transposase_22), which are
described above.

[0126] Cytokines, thus, are useful as therapeutic proteins for the treatment of 
disorders such as cancer, immune disorders, and inflammation. 

Dehydrogenase-Related Sequences 

[0127] Dehydrogenases are enzymes that catalyze the removal of 
hydrogen atoms in the absence of oxygen. They contribute to a wide range of
enzymatic reactions, including those involved in amino acid degradation, amino acid 
synthesis, the citric acid cycle, fatty acid oxidation, fatty acid synthesis, glycolysis, 
the pentose phosphate pathway, photosynthesis, pyruvate oxidation, and oxidative 
phosphorylation (Walker et al., 1992). Examples of dehydrogenases include steroid 
dehydrogenases, NADH dehydrogenases, and glyceraldehyde-3-phosphate 
dehydrogenase. 

[0128] Dehydrogenase-related sequences can possess or interact with
glyceraldehyde 3-phosphate dehydrogenase, NAD binding (GPDH) domains, which
play a role in glycolysis and gluconeogenesis by reversibly catalyzing the oxidation
and phosphorylation of D-glyceraldehyde-3-phosphate to 1,3-diphospho-glycerate
(http://pfam.wustl.edu/cgi-bin/getdesc?name=gpdh). Dehydrogenase-related
sequences can also possess or interact with 3-hydroxyacyl-CoA dehydrogenase, NAD
binding (3HCDH_N) domains, which catalyze the oxidation of 3-hydroxyacyl-CoA to
3-oxoacyl-CoA in fatty acid metabolism
(http://pfam.wustl.edu/cgi-bin/getdesc?name=3HCDH_N).

Disease-Related Sequences 

Amyotrophic Lateral Sclerosis 

[0129] Amyotrophic Lateral Sclerosis (Lou Gehrig's Disease) is a 
neurodegenerative disease that affects the motor neurons. The disease displays 
multiple clinical variants and can affect motor neurons throughout the nervous
system, e.g., the spinal cord and brainstem. One clinical variant, the autosomal
recessive form of juvenile amyotrophic lateral sclerosis, has been mapped to the
human chromosome 2q33-q34 region (Hadano et al., 2001). A protein family
characterized by the HAP1 N-terminal conserved region (HAP1_N) domain possesses
an N-terminal conserved region from hypothetical protein products of ALS2CR3 genes
found in the 2q33-2q34 region of chromosome 2
(http://pfam.wustl.edu/cgi-bin/getdesc?name=HAP1_N).

Gaucher's Disease

[0130] Gaucher's Disease is a genetic disease characterized by a deficiency 
of enzymes responsible for the breakdown and recycling of glycolipids, i.e., lipids 
with carbohydrate moieties, e.g., glucosylceramide; and sphingolipids, lipids with 
sphingosine moieties, e.g., sphingomyelin. Normally, the glycolipids and
sphingolipids in the membranes of senescent cells are metabolized by a multi-step 
process that includes the activities of acid beta-glucosidases and saposins. When 
these activities are absent, or present in reduced amounts, glucosylceramide and 
sphingolipids accumulate, and produce the Gaucher's disease phenotype. The disease 
displays multiple clinical variants, and can manifest with central nervous system 
pathology, enlargement of organs, e.g., liver and spleen, and an increase in the level 
of the cytokine transforming growth factor beta (Zhao and Grabowski, 2002; Perez 
Calvo et al., 2000; Comiand et al., 1997). The variability in clinical presentation is
consistent with the large number of different mutations observed in the acid beta- 
glucosidase and saposin genes. 

[0131] Acid beta-glucosidases are enzymes that metabolize glycolipids. 
Saposins are small proteins that are described in more detail below. Mammalian 
saposins are synthesized as a single precursor molecule (prosaposin) with saposin-A
(SAPA) and saposin-B (SapB_1; SapB_2) domains; prosaposin becomes an active
saposin following a proteolytic activation reaction
(http://pfam.wustl.edu/cgi-bin/getdesc?name=SAPA;
http://pfam.wustl.edu/cgi-bin/getdesc?name=SapB_1;
http://pfam.wustl.edu/cgi-bin/getdesc?name=SapB_1).

Huntington Disease 

[0132] Huntington Disease is a progressive neurodegenerative genetic 
disorder characterized by dementia, psychiatric symptoms, and a choreiform
movement disorder. It is caused by an increased number of repeats of the codon
CAG, which encodes the amino acid glutamine, in a gene located at the 4p16.3 region
of chromosome 4, which codes for a protein called huntingtin. The polyglutamine
tracts expressed by the mutant form of the gene selectively ablate striatal and cortical
neurons (Ho et al., 2001).

[0133] The Huntington Disease gene is widely expressed, but exerts
tissue-specific effects on neurons (Lin et al., 1993). The gene expresses multiple distinct
transcripts, and differential polyadenylation of the gene leads to the expression of 
transcripts of different sizes (Lin et al., 1993). There is a relative increase in the 
abundance of one transcript in the human brain, which has been hypothesized to 
account for the tissue-specific effects of the disease (Lin et al., 1993). The HAP1_N
protein domain, described above, binds to the gene product, huntingtin, in a
polyglutamine repeat-length-dependent manner
(http://pfam.wustl.edu/cgi-bin/getdesc?name=HAP1_N). This domain is also found in
several huntingtin-associated protein 1 (HAP1) homologues.
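
Because both the disease mechanism and HAP1_N binding depend on the length of the CAG repeat, a short sketch of how repeat length might be measured in a nucleotide sequence is given here. It is a simplified illustration only: the function name is hypothetical, the scan ignores reading frame, and it counts only uninterrupted CAG codons.

```python
import re


def longest_cag_run(dna):
    """Return the number of consecutive CAG codons in the longest run.

    Assumption: the input is a DNA string; matching is case-insensitive
    and is not restricted to a particular reading frame.
    """
    runs = re.findall(r"(?:CAG)+", dna.upper())
    # Each run is a multiple of three bases, so dividing by 3 gives codons.
    return max((len(run) // 3 for run in runs), default=0)


# Hypothetical sequence: five CAG codons, an interruption, then two more.
example = "ATG" + "CAG" * 5 + "CCG" + "CAG" * 2
print(longest_cag_run(example))
```

A frame-aware version would first locate the open reading frame and count only in-frame codons; that refinement is omitted here for brevity.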

Multiple Sclerosis (MS) 

[0134] Multiple sclerosis (MS) is a disease characterized by demyelination,
i.e., the loss of the myelin coating, of nerve axons. Its clinical course varies among
patients; these variations fall into two broad categories, a relapsing/remitting course,
and a chronic progressive course. MS has a complex etiology; it has an autoimmune
component, is influenced by genetics, and sometimes involves infectious agents. MS
results from an abnormal immune response to one or more antigens present in the
myelin sheaths that cover the nerve axons of genetically susceptible individuals,
which may be preceded by exposure to a causal infectious agent (Oksenberg et al.,
1999). 

[0135] The genetic susceptibility to MS is determined by MS susceptibility
genes, most of which demonstrate only a small to moderate effect on susceptibility,
e.g., the major histocompatibility complex at chromosome 6p21 (Oksenberg et al.,
1999). An etiological infectious agent has been isolated from the plasma and
cerebrospinal fluid of patients with multiple sclerosis (Perron et al., 1997). This agent
is a retroviral oncovirus, known as multiple sclerosis-associated retrovirus (MSRV),
also called LM7, and is found in association with virions produced by the cultured
cells of MS patients (Perron et al., 1997). MSRV proteins possess protein domains
characteristic of retroviral proteins. These include the Gag P30 core shell protein
(Gag_p30) domain, which is involved in viral assembly
(http://pfam.wustl.edu/cgi-bin/getdesc?name=Gag_p30) and the reverse transcriptase
(rvt) domain, which was described above.

Obesity

[0136] Although single-gene mutations have been shown to cause obesity in 
animal models, the most common forms of human obesity arise from the interactions 
of multiple genes, environmental factors, and behavior. Several genes have been 
shown to affect body weight regulation in humans and other animals. These include 
the ob, lep, CPE, ASIP, LEP, TUB, UPC, POMC, CCKAR, TNFA, and PPAR-γ 
genes (Comuzzie et al., 1998). Genetic regulation of body weight can be effected 
through diverse mechanisms. For example, the TUB gene family regulates body 
weight by encoding proteins that are phosphorylated in response to insulin, mediate 
insulin signaling, and are associated with a maturity-onset obesity associated with 
insulin resistance (Ikeda et al., 2002). CCKAR genes regulate body weight in a 
different manner; they regulate the hormone cholecystokinin, which produces a 
feeling of satiety following food intake (Ritter et al., 1994). 

[0137] Some genes that regulate body weight possess the WH1 domain, 
which is described above. Genes that regulate body weight can also possess or 
interact with the sprouty (sprouty) domain. This domain is found in sprouty proteins, 
which inhibit the Ras/mitogen-activated protein kinase cascade, a pathway initiated 
by receptor tyrosine kinases and involved in development 
(http://pfam.wustl.edu/cgi-bin/getdesc?name=Sprouty). Genes that regulate body weight 
can also possess or interact with a Tub (Tub) domain, which is found in Tubby, a 
mouse gene in which an autosomal recessive mutation resulting from a splicing defect 
causes maturity-onset obesity, insulin resistance and sensory deficits 
(http://pfam.wustl.edu/cgi-bin/getdesc?name=Tub). 

Oncogene 

[0138] An oncogene is any one of a large number of genes that can help 
make a cell cancerous. Typically, an oncogene is a mutant form of a normal gene, 
and is often a gene involved in the control of cell growth, division, or differentiation. 
Cells in higher organisms normally grow, divide, differentiate, and die under the 
regulation of other cells. Cancer cells proliferate, in part, because they are able to 
divide without input from other cells, as the result of accumulated mutations. 
Oncogenes include, but are not limited to, genes encoding GTP binding proteins, e.g., 
ras; growth factors, e.g., platelet-derived growth factor; growth factor receptors, e.g., 
platelet-derived growth factor receptor; kinases, e.g., src; nuclear proteins, e.g., myc; 
and tumor suppressors, e.g., retinoblastoma proteins. 

[0139] The products of oncogenes are frequently proteins involved in cell 
signaling, e.g., kinases, GTP-binding proteins, and receptors. For example, many 
human cancers have a mutation in a ras gene (Alberts et al., 1994). The ras proteins 
belong to a large superfamily of monomeric GTPases, and relay signals from receptor 
tyrosine kinases to the nucleus, stimulating cell proliferation or differentiation. Ras 
proteins function as switches, cycling between an active state, in which GTP is bound, 
and an inactive state, in which GDP is bound. A ras gene mutation can result in the 
translation of a protein that fails to hydrolyze its bound GTP, and persists abnormally 
in its active state, transmitting an intracellular signal for cell proliferation or 
differentiation even in the presence of regulatory non-proliferation and 
non-differentiation signals. Oncogene-related proteins can possess one of many ras 
protein domains 
(http://pfam.wustl.edu/cgi-bin/textsearch?terms=ras&search_what=all&sections=DE&sections=CC&size=100), 
including the sub-families Ras, Rab, Rac, Ral, Ran, Rap, and Ypt1. Oncogene-related 
proteins can also possess a Gtr1/RagA G-protein conserved region (Gtr1_RagA) domain, 
which is found in some G-proteins of the Ras family, e.g., the RagA/B human 
homologues of the ras GTP binding protein Gtr1 
(http://pfam.wustl.edu/cgi-bin/getdesc?name=Gtr1_RagA). 
Oncogene-related sequences can also possess or interact with an ATPase domain 
associated with diverse cellular activities; proteins with the AAA (ATPases 
Associated with diverse cellular Activities) domain can perform chaperone-like 
functions that assist in assembling, operating, or disassembling protein complexes. 
The domain includes a conserved region of approximately 220 amino acids that 
contains an ATP-binding site which can act as an ATP-dependent protein clamp to 
hold a protein in place (http://pfam.wustl.edu/cgi-bin/getdesc?name=AAA). Some 
oncogene-related sequences can also possess or interact with a C2 domain of 
approximately 116 amino-acid residues, which can be involved in calcium-dependent 
phospholipid binding and inositol-1,3,4,5-tetraphosphate binding, and is found, e.g., 
in some isozymes of protein kinase C 
(http://pfam.wustl.edu/cgi-bin/getdesc?name=C2). C2 domains are typically located 
between C1 domains (which bind phorbol esters and diacylglycerol) and protein kinase 
catalytic domains. Regions with homology to the C2 domain are present in many 
proteins, e.g., synaptotagmin. 

Parkinson's Disease 

[0140] Parkinson's disease is a neurological disorder that affects movement 
control. Complex interactions among groups of nerve cells in the central nervous 
system coordinate to control movement. One such group of neurons is located in the 
substantia nigra of the midbrain; these neurons release the neurotransmitter dopamine, 
which allows an organism to fine-tune its movements. In Parkinson's disease, neurons 
of the substantia nigra progressively degenerate, leaving the patient with clinical 
symptoms that may include resting tremor, muscular rigidity, a slowness of 
spontaneous movement, and poor balance and motor coordination (Seigel et al., 
1999). 

[0141] Parkinson's disease has multiple causes, including both genes and the 
environment. It also has multiple presentations, including juvenile-onset (before age 
45) and adult-onset (after age 45), and can be transmitted through either autosomal 
dominant or autosomal recessive mechanisms. In keeping with the diversity of 
etiologies, presentations, and genetic mechanisms, a large and diverse number 
of genes and gene products are involved in the pathogenesis of Parkinson's disease. For 
example, the PARK2 gene, which encodes the protein parkin, is mutant in autosomal 
recessive juvenile parkinsonism. Parkin is a ubiquitin protein ligase that is a 
component in the pathway that attaches ubiquitin to specific proteins, designating 
them for degradation (Fishman and Oyler, 2002). 

[0142] Parkinson's disease-related sequences can possess or interact with 
synuclein domains, which are expressed on the cytoplasmic regions of proteins found 
predominantly in neurons (http://pfam.wustl.edu/cgi-bin/getdesc?name=Synuclein). 
Alpha-synuclein, which possesses a synuclein domain, is mutated in several families 
with autosomal dominant Parkinson's disease. Gamma-synuclein, which also 
possesses a synuclein domain, is overexpressed in breast and ovarian cancers 
(Lavedan, 1998). 

Retinitis Pigmentosa 

[0143] Retinitis pigmentosa is a group of inherited retinopathies 
characterized by early stage loss of night vision, followed by loss of peripheral vision. 
Defects in any structural or functional proteins associated with the rod photoreceptor 
neurons of the retina, which are the cells that transduce light into a neuronal action 
potential, can lead to the disease (Seigel et al., 1999). 

[0144] GTPase regulators have been implicated in the pathology of retinitis 
pigmentosa. GTPase regulators are proteins that determine whether a GTP binding 
protein exists in a GTP-bound or GDP-bound state (Zhao et al., 2003); they are 
described in more detail below. GTPase regulators have a broad spectrum of 
intracellular functions, including intracellular vesicular transport. These proteins 
localize to a specific region of rod photoreceptor cells, in a narrow cilium that 
connects the cell body, where protein synthesis and basic metabolism take place, 
with the rod outer segment, where light is transduced to an action potential of the 
optic nerve (Zhao et al., 2003). Proteins necessary for the light transduction process 
are made in the cell body and must be transported to the outer segment via vesicular 
transport mechanisms. Mutant GTPase regulators, which regulate vesicular transport, 
play a role in the pathogenesis of retinitis pigmentosa (Roepman et al., 2000). 
Retinitis pigmentosa-related sequences can possess or interact with a Tctex-1 domain, 
which is comprised of a dynein light chain, and can bind to the cytoplasmic tail of 
rhodopsins, which are light-sensing proteins present in retinal rod cells 
(http://pfam.wustl.edu/cgi-bin/getdesc?name=Tctex-1). Mutations in this domain 
that are responsible for retinitis pigmentosa inhibit this binding. 

Alzheimer's Disease 

[0145] Alzheimer's disease is a neurodegenerative dementing illness. It is a 
genetically complex disease with multiple forms, including familial and sporadic 
forms, and early-onset and late-onset forms. Mutations in at least four genes are 
known to cause Alzheimer's disease, and there is evidence for additional Alzheimer's 
loci (McKusick, 2003). One form of Alzheimer's disease is caused by mutations in 
the amyloid precursor gene, another form is associated with the apolipoprotein E4 
allele, a third form is caused by a mutant presenilin-1 gene that encodes a 
seven-transmembrane domain protein, and a fourth form is caused by a mutant gene 
encoding a similar seven-transmembrane domain protein, presenilin-2 (McKusick, 
2003). 

[0146] Consistent with its multiple etiologies, multiple clinical presentations, 
and multiple genetic loci, Alzheimer's disease has a complex pathology. One facet of 
the pathology of Alzheimer's disease is the formation of amyloid plaques from 
amyloid precursor protein (Clark and Karlawish, 2003). Amyloid precursor protein 
can be processed in vitro by several different proteases such as secretases and 
caspases to yield peptide fragments, suggesting that these proteases may play a role in 
the formation of pathogenic amyloid plaques in vivo (Suh and Checler, 2002). 
Presenilins have been identified as likely candidates for the proteases that cleave 
amyloid precursor protein to pathogenic peptide fragments in vivo (Selkoe, 2001). 
Another facet of Alzheimer's disease pathology is an inflammatory component 
mediated by microglial cells, the brain's primary immunoeffector cells (Tan et al., 
1999). Microglial cells are attracted to and activated by amyloid deposits; they 
release inflammatory mediators that promote the aggregation of the deposits into 
plaques, and also directly induce or promote neurodegeneration (Hoozemans et al., 
2002). Therefore, current treatment strategies include anti-inflammatory and 
immunotherapeutic approaches, including vaccines (Weiner and Selkoe, 2002). 

[0147] Alzheimer's disease-related sequences can possess or interact with 
trypsin domains, which demonstrate a wide range of peptide degrading activities, 
including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activities 
(http://pfam.wustl.edu/cgi-bin/getdesc?name=trypsin). Alzheimer's disease-related 
sequences can also possess or interact with low-density lipoprotein receptor (ldl_rece) 
domains, which are characterized by seven successive cysteine-rich repeats of about 
40 amino acids at the N-terminal region, and which are also present in receptors for 
low density lipoprotein (LDL), the major cholesterol-carrying lipoprotein of plasma 
(http://pfam.wustl.edu/cgi-bin/textsearch?terms=ldl_rece+&search_what=all&sections=DE&sections=CC&size=100). 
Alzheimer's disease-related sequences can 
also possess or interact with a PT repeat (pt_a) domain, which includes the 
tetrapeptide XPTX, or a similar, conserved sequence. 

Williams-Beuren Syndrome 

[0148] Williams-Beuren syndrome is a complex genetic developmental 
disorder with multisystemic manifestations, and variability in its presentation. In 
90-95% of the cases reported, a gene deletion occurs at the 7q11.23 location on the long 
arm of chromosome 7; in the remaining cases, a variety of other chromosomal 
deletions and translocations have been observed (Wang et al., 1999). The most severe 
cases are characterized by cardiac anomalies, including aortic stenosis, mental 
retardation, growth deficiency, a characteristic facial appearance, dental 
malformation, and infantile hypercalcemia (Lashkari et al., 1999). 

[0149] The underlying molecular basis for the syndrome is the absence of 
the proteins encoded by the genes of the affected region of the chromosome. A 
missing elastin gene, with resulting extracellular matrix anomalies, is a consistent 
finding. Other genes that are present in and near the commonly deleted region of 
chromosome 7, and thus are likely to contribute to pathogenesis, are (1) a gene 
encoding a regulator of chromosome condensation-like G-exchanging factor, which is 
a factor that exchanges nucleotides for small GTP-binding proteins, (2) an 
N-acetylgalactosaminyltransferase, (3) a DNAJ-like chaperone, (4) NOL1/NOP2/sun 
domain-containing proteins, including a novel protein designated WBSCR20, which 
is expressed in skeletal muscle, and is similar to a 120 kilodalton 
proliferation-associated nucleolar antigen, (5) a methyltransferase designated WBSCR22, and (6) 
other proteins with no known homologies (Merla et al., 2002; Doll and Grzeschik, 
2001). Williams-Beuren-related sequences can possess or interact with a GTF2I-like 
repeat (GTF2I) domain, which is a DNA binding domain commonly deleted in 
Williams-Beuren syndrome (http://pfam.wustl.edu/cgi-bin/getdesc?name=GTF2I). 

Rheumatic Diseases 

[0150] Rheumatic diseases are inflammatory conditions that can have 
autoimmune, infective, or traumatic origins. They include arthritis, systemic lupus 
erythematosus, scleroderma, and Sjogren's syndrome. Arthritis refers to any 
inflammation of a joint. Systemic lupus erythematosus is an autoimmune disease in 
which patients produce antibodies to their own tissues, resulting in an inflammatory 
process that can damage organs. Scleroderma can present as systemic scleroderma, a 
chronic, progressive disease that is characterized by hardening and stiffening of the 
skin and damage to internal organs, e.g., heart, lungs, kidneys and esophagus. 
Sjogren's syndrome is a progressive immunological disorder characterized by 
inflammation and the subsequent destruction of exocrine glands, e.g., salivary glands, 
sweat glands, and lacrimal (tear) glands. 

[0151] The serum of patients with scleroderma and Sjogren's syndrome has 
antibodies directed against a protein that is a normal component of the Golgi 
apparatus (Seelig et al., 1994), an intracellular organelle composed of a stack of 
flattened cisternae with associated transport vesicles. The Golgi apparatus sorts 
proteins and sends them to their correct intracellular destination. This antigenic 
protein is a "golgin," one of a class of molecules characterized by an integral 
membrane domain and a large cytoplasmic region. Golgins organize the Golgi's 
structure, and influence protein sorting (Gillingham et al., 2002). Golgins function in 
a variety of ways, including cross-bridging Golgi cisternae to one another (Linstedt 
and Hauri, 1993) and tethering Golgi transport vesicles to the cisternal membranes 
(Shorter et al., 2002). Rheumatic disease-associated sequences can possess or interact 
with golgin-97, RanBP2alpha, Imh1p, and p230/golgin (GRIP) domains, which are 
found in many large coiled-coil proteins, are sufficient for targeting to the Golgi, and 
have a conserved tyrosine residue 
(http://pfam.wustl.edu/cgi-bin/getdesc?name=GRIP). 

Disintegrin-Related Sequences 

[0152] Disintegrins are proteins that interfere with the function of 
integrins. Disintegrins are generally proteins of about 70 amino acid residues that 
contain multiple disulfide bonds, bind with high affinity to a subset of integrins, and 
interfere with integrin binding to physiological ligands. Examples of 
disintegrin-related sequences include snake venoms and related proteins, cysteine-rich 
metalloproteinases and related non-enzymatic sequences, e.g., those expressed in the 
male reproductive tract, and membrane-anchored metalloproteinases with diverse 
functions, e.g., the shedding of cell-surface proteins such as cytokines and cytokine 
receptors, and the conferring of asthma susceptibility (Van Eerdewegh et al., 2002; 
Perry et al., 1995). 

[0153] Disintegrin-related sequences can possess or interact with 
disintegrin domains, which contain an Arg-Gly-Asp sequence, a sequence commonly 
found in adhesion proteins (http://pfam.wustl.edu/cgi-bin/getdesc?name=disintegrin). 
Proteins that comprise both disintegrin and metalloproteinase peptidase domains 
include ADAM proteins. Disintegrin-related sequences can also possess or interact 
with reprolysin family propeptide (Pep_M12B_propep) domains, which are domains 
that include the propeptide sequence of members of the peptidase family M12B, and 
contain a sequence motif similar to a sequence found in matrixin proteins 
(http://pfam.wustl.edu/cgi-bin/getdesc?name=Pep_M12B_propep). 

Factor-Related Sequences 

[0154] A factor is any molecule that contributes to a bodily process. Factors 
can function in specific biochemical reactions and cellular functions. There are many 
categories of factors, and factors are involved in many, if not all, physiological and 
pathological processes. Some exemplary factors are described in the following 
paragraphs; they are not exhaustive of the category. 

[0155] Transcription factors are factors that initiate or regulate transcription 
in eukaryotes. They include gene regulatory proteins, which turn specific sets of 
genes on or off, and general transcription factors, which assemble at the promoter 
region to enable and regulate transcription of many genes. They also include 
transcription elongation factors, which are proteins required for the addition of amino 
acids to growing polypeptide chains on ribosomes (Alberts et al., 1994). 
Transcription factors interact with a wide variety of molecules, including DNA 
binding proteins, polymerases, regulatory molecules such as kinases, and specific 
regions of DNA, e.g., promoters and enhancers (Alberts et al., 1994; Vallejo et al., 
1993). 

[0156] Translation factors, including translation initiation factors and release 
factors, are involved in initiating and regulating the rate of protein synthesis. They 
also interact with many molecules, including ribosomal proteins, mRNA, and 
molecules that regulate the incorporation of amino acids into protein, such as kinases 
and GTP (Price et al., 1993; Alberts et al., 1994). 

[0157] Export factors are involved in the export of molecules, e.g., RNA, 
from the nucleus (Stutz et al., 2000). Folding factors are involved in the process of 
folding proteins into their functional three-dimensional shapes, and are also involved 
in receptor function (Gao et al., 1994). Factors such as activators and coactivators 
interact with nuclear receptors to modulate cellular processes, e.g., transcription 
(Mahajan et al., 2002). 

[0158] ADP-ribosylation factors are involved in the addition of an 
ADP-ribose group donated from nicotinamide adenine dinucleotide (NAD) to specific 
amino acid residues in heterotrimeric G-proteins. They are involved in, for example, 
normal cellular processes, such as vesicular transport, and also in the pathologic states 
induced by cholera, pertussis, and botulinum toxins (Alberts et al., 1994; Amor et al., 
1994). Guanine nucleotide exchange factors bind to small G-proteins, such as Ras, 
and displace GDP in favor of GTP. They act as effectors or modulators of small 
G-proteins (Ehrhardt et al., 2001; Janeway et al., 2001; Shao and Andres, 2000). 

[0159] Factor-related sequences can possess or interact with 
ADP-ribosylation factor family (arf) domains, which are GTP-binding domains involved in 
protein trafficking (http://pfam.wustl.edu/cgi-bin/getdesc?name=arf). Factor-related 
sequences can also possess or interact with elongation factor Tu GTP binding 
(GTP_EFTU) domains, which are elongation factors that promote the GTP-dependent 
binding of aminoacyl tRNA to ribosomes during protein biosynthesis, and catalyze 
the translocation of the newly synthesised protein chain 
(http://pfam.wustl.edu/cgi-bin/getdesc?name=GTP_EFTU). Factor-related sequences can 
also possess or interact with 4F5 protein family (4F5) domains, which comprise 
ubiquitously expressed short proteins rich in aspartate, glutamate, lysine and arginine 
(http://pfam.wustl.edu/cgi-bin/getdesc?name=4F5). Factor-related sequences can also 
possess or interact with eukaryotic initiation factors, e.g., eukaryotic initiation factor 
4E (IF4E), which recognizes and binds mRNA during an early step of protein 
synthesis (http://pfam.wustl.edu/cgi-bin/getdesc?name=IF4E). 

Germ Cell Specific Protein-Related Sequences 

[0160] Germ cells, also called gametes, are cells that contribute to a 
new generation of organisms by giving rise to either an egg or a sperm. They are 
haploid cells specialized for sexual fusion. Proteins that are specific to germ cells can 
be found at one or more developmental stages of gametes. 

[0161] Germ cell-related sequences include germ cell genes and their 
gene products, their regulators and effectors, genes and gene products affected in 
disorders associated with germ cells, and antibodies that specifically recognize or 
modulate germ cell-related sequences. Examples of germ cell-related sequences 
include the germ cell-specific Y-box binding protein and contrin. Germ cell specific 
protein-related sequences possess or interact with the cold-shock DNA-binding (CSD) 
domain, which is described above. 

Growth Factor-Related Sequences 

[0162] A growth factor is an extracellular polypeptide signaling molecule 
that stimulates a cell to grow or proliferate. Many types of growth factors exist, 
including protein hormones and steroid hormones. Some growth factors have a broad 
specificity, and some have a narrow specificity. Examples of growth factors with 
broad specificity include platelet-derived growth factor, epidermal growth factor, 
insulin-like growth factor I, transforming growth factor β, and fibroblast growth 
factor, which act on many classes of cells. Examples of growth factors with narrow 
specificity include erythropoietin, which induces proliferation of precursors of red 
blood cells, interleukin-2, which stimulates proliferation of activated T-lymphocytes, 
interleukin-3, which stimulates proliferation and survival of various types of blood 
cell precursors, and nerve growth factor, which promotes the survival and the 
outgrowth of nerve processes from specific classes of neurons. 

[0163] Most growth factors have other actions in addition to inducing cell 
growth or proliferation, e.g., they may influence survival, differentiation, migration, 
or other cellular functions. Growth factors can have complex effects on their targets, 
e.g., they may act on some cells to stimulate cell division, and on others to inhibit it. 
They may stimulate growth at one concentration, and inhibit it at another. Growth 
factors are also involved in tumorigenesis. 

[0164] Growth factor-related sequences include sequences associated with 
the process of stimulating cell growth or proliferation by a growth factor. For 
example, they include intracellular effectors of growth, such as components of 
intracellular pathways that respond to growth factors (Kothapalli et al., 1997; Wax et 
al., 1994), sequences that bind directly or indirectly to growth factors (Van den 
Berghe et al., 2000), and sequences affected as a result of growth factor action. 

[0165] Growth factor-related sequences can possess or interact with a 
transforming growth factor beta like (TGF-beta) domain, which is a multifunctional 
peptide sequence that controls proliferation, differentiation and other functions in 
many cell types (http://pfam.wustl.edu/cgi-bin/getdesc?name=TGF-beta). Growth 
factor-related sequences can also possess or interact with a fibroblast growth factor 
(FGF) domain, which is found in a family of proteins involved in growth and 
differentiation (http://pfam.wustl.edu/cgi-bin/getdesc?name=FGF). 

GTPase-Related Sequences 

[0166] GTPases are enzymes that catalyze GTP hydrolysis, and 
comprise a large family of proteins with a similar globular GTP binding domain. 
When GTP is bound to a GTPase, it is hydrolyzed to GDP, and the domain undergoes 
a conformational change that inactivates the protein. GTPases are regulated by 
GTPase regulators, proteins that determine whether a GTP binding protein exists in a 
GTP-bound or GDP-bound state (Zhao et al., 2003). GTPase regulators include 
GTPase activating proteins, which bind the GTPase and induce it to hydrolyze its 
bound GTP to GDP; the GTPase remains in an inactive, GDP-bound state until it 
encounters a guanine nucleotide releasing protein, which binds to the GTPase and 
causes the release of the nucleotide. GTPases have a broad spectrum of intracellular 
functions, including intracellular vesicular transport. Examples of GTPase-related 
sequences include ras, GTPase-activating proteins, and guanine nucleotide releasing 
proteins. 
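
The switch behavior described in this paragraph can be illustrated with a minimal sketch. It is a toy model only, not part of the disclosed methods: the class and method names are hypothetical, the rebinding of GTP after nucleotide release is an assumption added to close the cycle, and the `hydrolysis_competent` flag is an assumed stand-in for a ras-like mutation that prevents hydrolysis of bound GTP.

```python
from enum import Enum


class NucleotideState(Enum):
    GTP_BOUND = "GTP-bound (active)"
    GDP_BOUND = "GDP-bound (inactive)"
    EMPTY = "nucleotide-free"


class GTPase:
    """Toy model of a GTPase switch (illustrative names, not a real API)."""

    def __init__(self, state=NucleotideState.GDP_BOUND, hydrolysis_competent=True):
        self.state = state
        # Assumption: False mimics a mutant that cannot hydrolyze bound GTP.
        self.hydrolysis_competent = hydrolysis_competent

    @property
    def active(self):
        return self.state is NucleotideState.GTP_BOUND

    def meet_activating_protein(self):
        """A GTPase activating protein induces hydrolysis of bound GTP to GDP."""
        if self.active and self.hydrolysis_competent:
            self.state = NucleotideState.GDP_BOUND

    def meet_releasing_protein(self):
        """A guanine nucleotide releasing protein causes release of bound GDP."""
        if self.state is NucleotideState.GDP_BOUND:
            self.state = NucleotideState.EMPTY

    def bind_gtp(self):
        """Assumed step: a nucleotide-free GTPase binds GTP and becomes active."""
        if self.state is NucleotideState.EMPTY:
            self.state = NucleotideState.GTP_BOUND


# Usage: a normal GTPase switches off after meeting an activating protein,
# whereas the hydrolysis-deficient variant stays in its active state.
normal = GTPase()
normal.meet_releasing_protein()
normal.bind_gtp()
normal.meet_activating_protein()

mutant = GTPase(hydrolysis_competent=False)
mutant.meet_releasing_protein()
mutant.bind_gtp()
mutant.meet_activating_protein()
```

In this sketch the normal protein ends in the inactive, GDP-bound state, while the hypothetical mutant remains GTP-bound.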

[0167] GTPase-related sequences can possess or interact with GTPase 
activator protein for Ras-like GTPase (RasGAP) domains, which are protein domains 
of about 250 residues that accelerate the GTPase activity of ras 
(http://pfam.wustl.edu/cgi-bin/getdesc?name=RasGAP). GTPase-related sequences 
can also possess or interact with putative GTPase activating protein for ARF (ArfGap) 
domains, which are protein domains with a zinc finger involved in intermolecular 
associations (http://pfam.wustl.edu/cgi-bin/getdesc?name=ArfGap). GTPase-related 
sequences can also possess or interact with ankyrin repeat domains (ank), which are 
tandemly repeated modules of about 33 amino acids found in a variety of functionally 
diverse proteins (http://pfam.wustl.edu/cgi-bin/getdesc?name=ank). GTPase-related 
sequences can also possess or interact with pleckstrin homology (PH) domains, which 
are protein domains of about 100 residues involved in intracellular signaling, or as 
components of the cytoskeleton (http://pfam.wustl.edu/cgi-bin/getdesc?name=PH). 

Heat-Shock Protein-Related Sequences 

[0168] Heat-shock proteins, also referred to as stress-response proteins, are 
proteins that are synthesized in response to an elevated temperature or other cell 
stressor, and help the cell withstand environmental insults. A cell stressor can induce 
a battery of genes that encode gene products that protect the cell from the result of the 
insult, e.g., proteins that stabilize and repair partially denatured cell proteins. Some 
heat-shock proteins, e.g., chaperones, are present at high levels in unstressed cells, 
and further induced by stress. Chaperones assist other proteins in attaining their 
proper secondary and tertiary structures. For example, members of the 
tubulin-specific chaperone A family possess tubulin-specific chaperone A (TBCA) domains 
that fold tubulin polypeptides into their functional configuration 
(http://pfam.wustl.edu/cgi-bin/getdesc?name=TBCA). 

[0169] Heat and other stressors further induce the synthesis of a family of 
90-kDa heat-shock proteins that are already abundant in unstressed cells (Pepin et al., 
2001; Lees-Miller et al., 1989; Rebbe et al., 1987). Members of this family possess a 
hsp 90 protein (HSP90) domain that interacts with tubulin, actin, tyrosine kinase 
oncogene products of retroviruses, eIF2alpha kinase, and steroid hormone receptors 
(Lees-Miller and Anderson, 1989). This domain includes a highly-conserved 
N-terminal region, separated from a conserved, acidic C-terminal region by a 
highly-acidic, flexible linker region (http://pfam.wustl.edu/cgi-bin/getdesc?name=HSP90). 

[0170] Another family of heat-shock proteins, the hsp70 proteins, has an 
average molecular weight of 70 kDa; some members of this family are only expressed 
under conditions of stress, while some are present in cells under normal conditions. 
Hsp70 proteins reside in different cellular compartments, e.g., the nucleus, cytosol, 
mitochondria, and endoplasmic reticulum. Hsp70 proteins, e.g., Hsc73, can be 
differentially expressed at different stages of development (Soulier et al., 1996). 
Hsp70 proteins, e.g., the chaperone hsp70-like dnaK protein, can associate with 
proteins that possess a DnaJ domain, which comprises an N-terminal conserved 
domain of about 70 amino acids, a glycine-rich region of about 30 amino acids, a 
central domain containing four repeats of a CXXCXGXG motif, and a C-terminal 
region of 120 to 170 amino acids 
(http://pfam.wustl.edu/cgi-bin/getdesc?name=DnaJ). Proteins with DnaJ domains can be 
post-translationally modified by farnesylation (Andres et al., 1997). 
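
As an illustration of how the CXXCXGXG motif mentioned above could be located in a polypeptide sequence, the following is a minimal sketch using a regular expression. It is offered as an assumption-laden example only: the function name and the sample sequence are hypothetical, X is taken to mean any residue, and finding the motif does not by itself establish the presence of a DnaJ domain.

```python
import re

# CXXCXGXG, where X is assumed to be any residue. The lookahead lets the
# scan report overlapping occurrences as well as separated ones.
CXXCXGXG = re.compile(r"(?=(C..C.G.G))")


def find_cxxcxgxg(sequence):
    """Return (0-based start, matched residues) for each CXXCXGXG occurrence."""
    seq = sequence.upper()
    return [(m.start(), m.group(1)) for m in CXXCXGXG.finditer(seq)]


# Hypothetical peptide invented for illustration; not a disclosed sequence.
example = "MKCAACDGSGAAACPTCHGRGLL"
hits = find_cxxcxgxg(example)
# hits == [(2, "CAACDGSG"), (13, "CPTCHGRG")]
```

A sequence carrying the four repeats described in the text would be expected to yield four hits in the central region; a real analysis would rely on a profile-based domain search rather than on this simple pattern.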
Helicase-Related Sequences 

[0171] Helicases are enzymes that use energy from the hydrolysis of 
ATP to unwind the DNA helix at the replication fork, allowing the single strands to be 
copied. Proteins with DNA helicase activity play roles in DNA replication, repair, 
and recombination. Disorders associated with helicases include Xeroderma 
pigmentosum, Cockayne syndrome, diffuse collagen disease, alpha-thalassemia, 
Bloom syndrome, Werner syndrome, and Rothmund-Thomson syndrome (Miyajima, 
2002). Examples of helicases include RNA helicases, RECQL4, and 
minichromosome maintenance helicase. 

[0172] Helicase-related sequences can possess or interact with helicase 
associated (HA) domains, which are protein domains comprising alpha helices that 
may bind to nucleic acids (http://pfam.wustl.edu/cgi-bin/getdesc?name=HA). 
Helicase-related sequences can also possess or interact with helicase conserved 
C-terminal (helicase_C) domains, which are protein domains that are found in a subset 
of helicases designated the DEAD/H helicases 
(http://pfam.wustl.edu/cgi-bin/getdesc?name=helicase_C). 

Hydrolase-Related Sequences 

[0173] Hydrolases are enzymes that catalyze the hydrolysis of a variety 
of bonds, such as esters, glycosides, and peptides. Hydrolases split a molecule into 
fragments by adding water; the water's hydrogen atom is incorpprated into one 
fragment, and the hydroxyl group is incorporated into another. Hydrolases are 
involved in a wide range of physiological and pathological processes, including 
proteolysis, phosphatase activity, and sugar metabolism. Examples of hydrolases 
include protein hydrolases, lipid hydrolases, nucleic acid hydrolases, and small 
molecule, e.g., coenzyme A, hydrolases (Hawes et al., 1996).

[0174] Hydrolase-related sequences can possess or interact with
alpha/beta hydrolase fold (abhydrolase) domains, which are catalytic domains found
in a wide range of hydrolytic enzymes of different phylogenetic origins and catalytic
functions (http://pfam.wustl.edu/cgi-bin/getdesc?name=abhydrolase). Hydrolase-related
sequences can also possess or interact with dUTPase domains, which are
protein domains that hydrolyze dUTP to dUMP and pyrophosphate.

Immune Cell-Related Sequences

[0175] An immune cell is a cell involved in, or associated with, the immune
system. Immune cells include cells in the myeloid and lymphocytic arms of the
immune response, as well as their precursors. Immune cells also include cells at all 
stages in the differentiation pathways that produce cells associated with the immune 
system. These cells can reside, either permanently or temporarily, in the spleen, 
lymph nodes or mucosal-associated lymphoid tissues (MALT). Immune cell-related 
sequences are involved in all functions of the immune response, e.g., antibody
production and cell-mediated immunity, and can function at any point in time, ranging 
from the embryonic formation of the immune system, through the time of an immune
challenge, to many decades later, e.g., when a B-cell memory response is invoked 
(Janeway, 2001). 

[0176] Immune-cell related sequences of differentiating immune cells
include pre-B cells that do not produce immunoglobulin light chain, but express a 
transcript homologous to immunoglobulin lambda light-chain genes, the expression of
which is limited to pre-B cells and select other cells that have no surface 
immunoglobulin (Hollis et al., 1989). Immune-cell related sequences of activated
immune cells include a B-cell-restricted transcription factor expressed by activated B 
cells; its expression pattern suggests it has a role in regulating B-cell differentiation 
(Massari et al., 1998). 

[0177] Examination of the expression of immune-cell related sequences can
detect and diagnose immunoregulatory abnormalities. For example, genes that 
encode proteins which mediate the combinatorial process that combines a finite
number of component genes into the very broad range of antigen-specific 
immunoglobulin and T-cell binding proteins, are expressed at higher levels in patients 
with systemic lupus erythematosus (SLE) than in healthy subjects (Girschick et al.,
2002). 

[0178] Immune cell-related sequences can possess or interact with a CUB
domain, which is an extracellular domain of approximately 110 amino acids, and is
present in functionally diverse, including developmentally regulated, proteins
(http://pfam.wustl.edu/cgi-bin/getdesc?name=CUB). Immune cell-related sequences
can also possess or interact with a CD-20 domain, which has four transmembrane 
regions, both extracellular and cytoplasmic extensions, and is found, inter alia, in a 
high affinity IgE receptor (http://pfam.wustl.edu/cgi-bin/getdesc?name=CD20). 
Immune cell-related sequences can also possess or interact with an interferon-induced 
transmembrane protein (CD225) domain, which is found in a family of proteins that 
includes the human leukocyte antigen CD225, an interferon-inducible transmembrane 
protein associated with interferon-induced cell growth suppression
(http://pfam.wustl.edu/cgi-bin/getdesc?name=CD225). Immune cell-related sequences can also possess
or interact with sushi domains, also known as complement control protein (CCP) 
modules, or short consensus repeats (SCR). These domains are found in a wide 
variety of complement and adhesion proteins, including proteins responsible for the 
antigenicity of blood group antigens on the external face of the red blood cell 
membrane (http://pfam.wustl.edu/cgi-bin/getdesc?name=sushi). Immune cell-related 
sequences can also possess or interact with SH2 domains and rvt domains; both are 
described above. 

Integrase-Related Sequences 

[0179] Integrases are enzymes that form proviruses by inserting a linear 
double-stranded DNA copy of a retroviral genome into host cell DNA. Examples of 
integrases include HIV integrase, PhiC31 integrase, and Sip. 

[0180] Integrase-related sequences can possess or interact with an 
integrase zinc binding (Integrase_Zn) domain, which is a zinc binding protein
domain located near the N-terminus
(http://pfam.wustl.edu/cgi-bin/getdesc?name=Integrase_Zn). Integrase-related
sequences can also possess or interact with an
integrase core (rve) domain, which is a protein domain that forms the central catalytic 
core of the integrase (http://pfam.wustl.edu/cgi-bin/getdesc?name=rve). This domain
acts as an endonuclease to cleave the nucleotide and catalyzes the transfer of the viral 
DNA strand to the integration site of the host DNA. Integrase-related sequences also 
possess or interact with an integrase DNA binding (integrase) domain, which is a 
DNA-binding protein domain near the C-terminus
(http://pfam.wustl.edu/cgi-bin/getdesc?name=integrase). Integrase-related sequences also
possess or interact with reverse transcriptase (rvt) domains, which are described above.
Integrase-related sequences also possess or interact with an RNase H domain, which is a
protein domain that hydrolyzes the RNA portion of RNA/DNA hybrids
(http://pfam.wustl.edu/cgi-bin/getdesc?name=rnaseH).

Integrin-Related Sequences 

[0181] Integrins are transmembrane proteins that mediate cell to cell as
well as cell to matrix adhesion, and provide a means of communication between the 
interior of a cell and the extracellular matrix. The extracellular portion of integrins
binds to components of the extracellular matrix, e.g., collagen, fibronectin and 
laminin. The intracellular portion of integrins interacts with the cell cytoskeleton, 
e.g., actin filaments near the cell surface. Integrins transmit information about the
extracellular environment across the plasma membrane to the cytoskeleton, where it is 
available to intracellular signaling mechanisms (Alberts et al., 1994). Structurally,
integrins consist of heterodimers of an alpha and a beta subunit. Each subunit has a
large N-terminal extracellular domain followed by a transmembrane domain and a 
short C-terminal cytoplasmic region. The pairing of certain alpha subunits with 
certain beta-subunits determines ligand specificity, localization and function. The 
extracellular binding domains of integrins often bind their ligands with low affinity;
simultaneous, weak, binding with multiple matrix molecules provides the cell with a 
means to sense its complex, changing, extracellular environment without becoming 
glued to it. Examples of integrin-related sequences include integrin alpha and beta 
subunits, collagens, and integrin-linked kinase (Zhang et al., 2002).

[0182] Integrin-related sequences can possess or interact with von
Willebrand factor type A (vwa) domains, which are protein domains that participate 
in diverse biological functions, e.g., cell adhesion, migration, homing, pattern 
formation, and signal transduction
(http://pfam.wustl.edu/cgi-bin/getdesc?name=vwa). Integrin-related sequences can also
possess or interact with FG-GAP
repeat (FG-GAP) domains, which are protein domains present in the vicinity of ligand 
binding domains at the N-terminus of integrin alpha subunits
(http://pfam.wustl.edu/cgi-bin/getdesc?name=FG-GAP).

Interacting Protein-Related Sequences 

[0183] An "interacting protein" is a protein that interacts with another
molecule. Interacting proteins are involved in every aspect of cellular function. 
Interacting proteins have been characterized in all known locations in the cell, and 
include all, or most types of, proteins. Interacting proteins in the nucleus regulate
such diverse functions as apoptosis, transcription, homologous recombination, and 
DNA repair. Nuclear fibroblast growth factor-2 interacting factor interacts with
fibroblast growth factor 2 to prevent apoptosis (Van den Berghe et al., 2000). Grap2 
cyclin-D interacting protein (GCIP), a nuclear cell-cycle protein, inhibits select
transcriptional events, and reduces the level of phosphorylation of nuclear
retinoblastoma protein (Chang et al., 2000). Pir51, a human homologue of RecA, a
bacterial enzyme that mediates genetic recombination, interacts with the enzyme
Rad51 to regulate homologous recombination and DNA repair in mammalian cells
(Kovalenko et al., 1997). Hepatitis B virus X-associated protein (HBXAP), a protein 
demonstrated to play a role in the development of hepatocellular carcinoma, interacts
with the hepatitis B virus regulatory gene product HBx to increase viral transcription 
(Shamay et al., 2002).

[0184] Interacting protein-related proteins can utilize many protein domain 
motifs for interaction. They can possess or interact with domains that mediate 
interaction with DNA, RNA, ions, or other proteins. For example, PDZ domains, 
which are also known as DHR or GLGF domains, target signaling molecules to
membranes and mediate the assembly of functional membrane domains (Fanning and 
Anderson, 1999). Interacting protein-related proteins can also possess or interact with
rrm domains, which are described above. 

Isomerase-Related Sequences 

[0185] Isomerases are enzymes that convert molecules into their 
positional isomers, i.e., into molecules with the same chemical formula but a different 
stereochemical arrangement of atoms. Isomerases act on a wide variety of molecules, 
including sugars, amino acids, and nucleic acids. They are involved in a wide range 
of physiological and pathological functions, including those involving metabolic and
synthetic pathways. 

[0186] Isomerase-related sequences include isomerase genes and gene
products, their substrates, products, activators, inhibitors, effectors, and cofactors, 
regulatory molecules that modulate their function, genes and gene products affected in 
disorders associated with isomerases and antibodies that specifically recognize or 
modulate isomerase-related sequences. Examples of isomerase-related sequences 
include triosephosphate isomerases, peptidyl-prolyl isomerases, glucose phosphate 
isomerases, disulfide isomerases, ketosteroid isomerases, and ribosyltransferase-
isomerases (Brown et al., 1985).

[0187] Isomerase-related sequences can possess or interact with
triosephosphate isomerase (TIM) domains, which are protein domains that catalyze 
the reversible interconversion of glyceraldehyde 3-phosphate and dihydroxyacetone 
phosphate (http://pfam.wustl.edu/cgi-bin/getdesc?name=TIM). Isomerase-related
sequences can also possess or interact with cyclophilin type peptidyl-prolyl cis-trans 
isomerase (pro_isomerase) domains, which accelerate protein folding by catalyzing
the cis-trans isomerization of peptide bonds
(http://pfam.wustl.edu/cgi-bin/getdesc?name=pro_isomerase).

Mucin-Related Sequences

[0188] The term mucin refers to both an albumin-like substance that is
present in mucus, and to transmembrane proteins that can typically be produced in 
both soluble and transmembrane forms. Soluble mucins comprise mucus gels that 
protect epithelial cells in the airways, digestive tract, and other organs, and are found 
in body fluids, such as milk, tears, and saliva. In their transmembrane forms, mucins 
provide a steric barrier to protect the apical surface of epithelial cells. 
Transmembrane mucins are also involved in pathogenesis; for example, they mediate 
viral entry into cells, promulgate the inflammatory response, and are involved in the 
regulation of abnormal cell proliferation (Jeffery and Zhu, 2002; Tsuda et al., 1993). 
Examples of mucins include MUC2 mucin, mucin carcinoembryonic antigen, and 
Muc3 membrane bound intestinal mucin. 

[0189] Mucin-related sequences can possess or interact with mucin-like
glycoprotein (tryp_mucin) domains, which are domains that are involved in the 
interaction of parasites with host cells (http://pfam.wustl.edu/cgi- 
bin/getdesc?name=Tryp_mucin). Mucin-related sequences can also possess or 
interact with multi-glycosylated core protein (MGC-24) domains, which are protein 
domains of sialomucins that are expressed in many normal and cancerous tissues 
(http://pfam.wustl.edu/cgi-bin/getdesc?name=MGC-24).

Other Polypeptide-Related Sequences 

[0190] In addition to the sequences described above, the sequences of
the invention include nucleotide and amino acid sequences, some with known 
function, and some with unknown function, that fall into a broad array of categories. 

[0191] Polypeptide-related sequences of the invention can possess or
interact with groucho/TLE N-terminal Q-rich (TLE_N) domains, which are protein
domains found in co-repressor proteins, and are involved in oligomerization 
(http://pfam.wustl.edu/cgi-bin/getdesc?name=TLE_N). Polypeptide-related 
sequences of the invention can also possess or interact with uncharacterized protein 
family 0160 (UPF0160) domains, which are protein domains found in proteins that 
include multiple metal-binding residues, and in some cases act as a phosphodiesterase 
(http://pfam.wustl.edu/cgi-bin/getdesc?name=UPF0160). Polypeptide-related
sequences of the invention can also possess or interact with SNF7 domains, which are
protein domains involved in protein sorting and transport from the endosome to the 
lysosome or vacuole of eucaryotic cells (http://pfam.wustl.edu/cgi-bin/getdesc? 
name=SNF7). Polypeptide-related sequences of the invention can also possess or
interact with NifU-like N-terminal (NifU_N) domains, which are protein domains 
involved in nitrogen fixation, and other functions
(http://pfam.wustl.edu/cgi-bin/getdesc?name=NifU_N). Polypeptide-related sequences of the invention can also
possess or interact with tRNA synthetases class II (D, K, and N) (tRNA-synt_2)
domains, which are protein domains that activate the amino acids asparagine,
aspartic acid, and lysine, and transfer them to specific tRNA molecules 
(http://pfam.wustl.edu/cgi-bin/getdesc?name=tRNA-synt_2).

[0192] Polypeptide-related sequences of the invention can also possess
or interact with dynein heavy chain (dynein_heavy) domains, which are protein 
domains that correspond to the C-terminal region of the dynein heavy chain
(http://pfam.wustl.edu/cgi-bin/getdesc?name=Dynein_heavy). Polypeptide-related
sequences of the invention can also possess or interact with cyclin-dependent kinase 
regulatory subunit (CKS) domains, which are protein domains of approximately 79- 
150 amino acid residues that are involved in regulating progression through the cell 
cycle (http://pfam.wustl.edu/cgi-bin/getdesc?name=CKS).

[0193] Polypeptide-related sequences of the invention can also possess 
or interact with nucleoside diphosphate linked to some other moiety X (NUDIX) 
domains, which are protein domains that are involved in removing oxidatively 
damaged nucleotides (http://pfam.wustl.edu/cgi-bin/getdesc?name=NUDIX). 
Polypeptide-related sequences of the invention can also possess or interact with
T-complex protein/cpn60 chaperonin (cpn60_TCP1) domains, which are protein
domains involved in protein folding and oligomerization
(http://pfam.wustl.edu/cgi-bin/getdesc?name=cpn60_TCP1). Polypeptide-related sequences of the invention can
also possess or interact with F-actin capping protein, beta subunit (F_actin_cap_B) 
domains, which are protein domains of approximately 280 amino acids that are 
involved in capping actin, i.e., blocking the exchange of actin monomers
(http://pfam.wustl.edu/cgi-bin/getdesc?name=F_actin_cap_B).

[0194] Polypeptide-related sequences of the invention can also possess 
or interact with G-protein alpha subunit (G-alpha) domains, which are protein
domains that bind guanyl nucleotides, and function as a GTPase
(http://pfam.wustl.edu/cgi-bin/getdesc?name=G-alpha). Polypeptide-related sequences of the invention
can also possess or interact with Kruppel-associated box (KRAB) domains, which are 
protein domains involved in protein-protein interactions, and present in some zinc 
finger proteins (http://pfam.wustl.edu/cgi-bin/getdesc?name=KRAB). Polypeptide-related
sequences of the invention can also possess or interact with metallopeptidase
family M24 (Peptidase_M24) domains, which are protein domains that are found in
some metalloproteases, including proline dipeptidase, and methionine aminopeptidase 
(http://pfam.wustl.edu/cgi-bin/getdesc?name=Peptidase_M24). Polypeptide-related 
sequences of the invention can also possess or interact with thioredoxin (thiored) 
domains, which are protein domains involved in oxidation/reduction reactions by
reversibly oxidizing disulfide bonds (http://pfam.wustl.edu/cgi-bin/getdesc? 
name=thiored). 

[0195] Polypeptide-related sequences of the invention can also possess
or interact with TUDOR domains, which are protein domains involved in the 
formation of primordial germ cells and in normal abdominal segmentation
(http://pfam.wustl.edu/cgi-bin/getdesc?name=TUDOR). Polypeptide-related
sequences of the invention can also possess or interact with SIT4 phosphatase- 
associated protein (SAPS) domains, which are protein domains that are involved in 
cyclin transcription (http://pfam.wustl.edu/cgi-bin/getdesc?name=SAPS). 
Polypeptide-related sequences of the invention can also possess or interact with 
ankyrin repeat (ank) domains, which are protein domains of approximately 33 amino 
acids, and are sometimes found in tandemly repeated modules
(http://pfam.wustl.edu/cgi-bin/getdesc?name=ank). Polypeptide-related sequences of the invention can also
possess or interact with nicotinamide N-methyltransferase/phenylethanolamine
N-methyltransferase/thioether S-methyltransferase (NNMT_PNMT_TEMT) domains,
which are protein domains that are found in proteins that use S-adenosyl-L-methionine
as the methyl donor (http://pfam.wustl.edu/cgi-bin/getdesc?name=NNMT_PNMT_TEMT).
Polypeptide-related sequences of the invention can also
possess or interact with C1q domains, which are protein domains involved in
activating the serum complement system
(http://pfam.wustl.edu/cgi-bin/getdesc?name=C1q). Polypeptide-related sequences of the invention can also possess or
interact with collagen triple helix repeat (Collagen) domains, which are protein
domains that typically form extracellular connective tissue
(http://pfam.wustl.edu/cgi-bin/getdesc?name=Collagen).

[0196] Polypeptide-related sequences of the invention can also possess 
or interact with the hyaluronan/mRNA binding family (HABP4_PAI-RBP1) domain, 
which is a protein domain that can bind to the glycosaminoglycan hyaluronan, and to
RNA (http://pfam.wustl.edu/cgi-bin/getdesc?name=HABP4_PAI-RBP1).
Polypeptide-related sequences of the invention can also possess or interact with 
eucaryotic aspartyl protease (asp) domains, which are protein domains that cleave 
peptide bonds; proteins with this domain include pepsins, cathepsins, and rennin
(http://pfam.wustl.edu/cgi-bin/getdesc?name=asp). Polypeptide-related sequences of 
the invention can also possess or interact with trypsin domains, which are protein
domains that function as serine proteases (http://pfam.wustl.edu/cgi-bin/getdesc?
name=trypsin). Polypeptide-related sequences of the invention can also possess or 
interact with Kunitz/Bovine pancreatic trypsin inhibitor (Kunitz_BPTI) domains,
which are protein domains that are found in serine protease inhibitors
(http://pfam.wustl.edu/cgi-bin/getdesc?name=Kunitz_BPTI). Polypeptide-related sequences of the
invention can also possess or interact with proliferating cell nuclear antigen, N- 
terminal (PCNA) domains, which are protein domains that are found on non-histone 
acidic nuclear proteins, and play a role in controlling DNA replication
(http://pfam.wustl.edu/cgi-bin/getdesc?name=PCNA).

Oxygenase-Related Sequences 

[0197] Oxygenases are enzymes that catalyze the incorporation of
molecular oxygen into organic substances. Dioxygenases, also known as oxygen 
transferases, catalyze the introduction of both atoms of molecular oxygen, and 
typically contain iron. Monooxygenases, also known as mixed function oxygenases, 
introduce one oxygen atom; the other is reduced to water. Examples of oxygenase- 
related sequences include cytochrome oxygenases, heme oxygenases, 
cyclooxygenases, lipoxygenases, and peptide-aspartate beta-dioxygenase. 

[0198] Oxygenase-related sequences can possess or interact with alkyl
hydroperoxide reductase/thiol specific antioxidant (AhpC-TSA) domains, which are
responsible for providing a defense against sulfur-containing radicals; proteins that
possess this domain include allergens, e.g., asp f 3, mal f 2, and mal f 3 
(http://pfam.wustl.edu/cgi-bin/getdesc?name=AhpC-TSA). Oxygenase-related 
sequences can also possess or interact with monooxygenase domains, which are 
protein domains that utilize flavin adenine dinucleotide (FAD)
(http://pfam.wustl.edu/cgi-bin/getdesc?name=Monooxygenase). Oxygenase-related sequences can also
possess or interact with dioxygenase domains, which are protein domains that 
catalyze the incorporation of both atoms of molecular oxygen into substrates 
(http://pfam.wustl.edu/cgi-bin/getdesc?name=Dioxygenase).

Peroxidase-Related Sequences 

[0199] Peroxidases are enzymes that catalyze the reduction of
hydrogen peroxide. Peroxidases are generally located within peroxisomes, which are
intracellular organelles that metabolize fatty acids and toxic compounds. Disorders
associated with peroxidase-related sequences include X-linked adrenoleukodystrophy.
Examples of peroxidase-related sequences include glutathione peroxidases, thiol
peroxidases, catalases, horseradish peroxidases, anionic peroxidases, and thyroid 
peroxidases. 

[0200] Peroxidase-related sequences can possess or interact with alkyl 
hydroperoxide reductase/thiol specific antioxidant (AhpC-TSA) domains, which are 
protein domains that can reduce organic hydroperoxides
(http://pfam.wustl.edu/cgi-bin/getdesc?name=AhpC-TSA).

Phospholipase-Related Sequences

[0201] Phospholipases are enzymes that act on phospholipids. They 
characteristically generate products that are active in signal transduction pathways. 
For example, phospholipase C hydrolyzes phosphatidylinositol bisphosphate (PIP2) to
generate the two intracellular mediators, inositol trisphosphate (IP3) and 
diacylglycerol. IP3 releases Ca2+ from stores in the endoplasmic reticulum, increasing
the cytosolic Ca2+ concentration. Diacylglycerol remains in the plasma membrane
and activates protein kinase C. 

[0202] Phospholipase activity is involved in the synthesis of eicosanoids, 
inflammatory mediators that include prostaglandins, prostacyclins, thromboxanes, and 
leukotrienes. Corticosteroid hormones, such as cortisone, for example, inhibit 
phospholipase activity in the first step of the eicosanoid synthesis pathway.
Corticosteroid hormones are widely used clinically to treat noninfectious 
inflammatory diseases, such as some forms of arthritis (Ribardo et al., 2002). 

[0203] Phospholipids play a pivotal role in the modulation of intestinal 
inflammation. The mucosal surface of the digestive tract functions as a regulatory 
barrier between the gastrointestinal lumen and the underlying mucosal immune 
system. Phospholipids help preserve the mucosa following various forms of injury or 
physiological damage to the lumen, thus preventing invasion of harmful luminal
factors into the host, which subsequently may lead to inflammation, or a pathological 
immune response, both promoting and inhibiting gastrointestinal inflammation and 
immunity (Sturm and Dignass, 2002). 

[0204] Phospholipase-related sequences can possess or interact with 
lysophospholipase catalytic (PLA2_B) domains, which catalyze the release of fatty
acids from lysophospholipids (http://pfam.wustl.edu/cgi-bin/getdesc?name=PLA2_B).
Phospholipase-related sequences can also possess or interact with 
phospholipase/carboxylesterase (abhydrolase_2) domains, which have broad substrate 
specificity (http://pfam.wustl.edu/cgi-bin/getdesc?name=abhydrolase_2).
Phospholipase-related sequences can also possess or interact with GDSL-like 
lipase/acylhydrolase (Lipase_GDSL) domains, which are present in lipolytic enzymes
with serine in the active site
(http://pfam.wustl.edu/cgi-bin/getdesc?name=Lipase_GDSL).

Prosaposin-Related Sequences 

[0205] Saposins are small lysosomal proteins that activate lysosomal
lipid-degrading enzymes, including enzymes that metabolize sphingosine. They
typically isolate lipids from their membrane surroundings, and increase their 
accessibility to degradative enzymes. Mammalian saposins are synthesized as a 
single precursor molecule, prosaposin, which becomes an active saposin following 
proteolytic activation. Examples of prosaposin-related sequences include saposin A,
saposin B, and saposin C. Disorders associated with prosaposin-related sequences 
include neurodegenerative diseases similar to Tay-Sachs and Sandhoff
diseases, e.g., Gaucher's disease, which is described above. 

[0206] Prosaposin-related sequences can possess or interact with
saposin-A (SapA) domains, saposin B1 (SapB_1) domains, and saposin B2 (SapB_2)
domains, which are described above.

Proteasome-Related Sequences

[0207] Proteasomes are intracellular complexes that degrade proteins. 
Proteasomes recognize proteins that have been marked for destruction by the addition
of a ubiquitin molecule, unfold these ubiquitinated proteins, cleave them into small
peptides of 6-12 amino acids, and release them into the cytosol (Mitch and Goldberg,
1996). Examples of proteasome-related sequences include 26S proteasome subunits, 
26S proteasome regulatory chains, and ubiquitin. 

[0208] Proteasome-related sequences can possess or interact with
proteasome/cyclosome repeat (PC_rep) domains, which are protein domains that are 
present in regulatory subunits of the proteasome
(http://pfam.wustl.edu/cgi-bin/getdesc?name=PC_rep). Proteasome-related sequences can also possess or
interact with Mov34/MPN/PAD-1 family (Mov34) domains, which are protein
domains found at the N-terminus of regulatory subunits of the proteasome 
(http://pfam.wustl.edu/cgi-bin/getdesc?name=Mov34). 

Reductase-Related Sequences 

[0209] Reductases are enzymes that catalyze reduction reactions, i.e.,
reactions in which hydrogen is combined with a molecule, or reactions in which 
oxygen is removed from a molecule. Examples of reductases include dehydrogenase 
reductases, oxidoreductases, quinone reductases, CoA reductases, dihydrofolate 
reductases, tetrahydrofolate reductases, carbonyl reductases, nitrate reductases, 
epoxide reductases, NADP(+) reductases, ribonucleotide reductases, and thioredoxin 
reductases (Loeffen et al., 1998). 

[0210] Reductase-related sequences can possess or interact with short
chain dehydrogenase (adh_short) domains, which are present in a wide variety of 
proteins (http://pfam.wustl.edu/cgi-bin/getdesc?name=adh_short). Reductase-related
sequences can possess or interact with NADH-Ubiquinone oxidoreductase (complex 
I), chain 5 N-terminus (oxidored_q1_N) domains, which are protein domains that
catalyze the transfer of electrons from NADH to ubiquinone in a reaction that can be 
associated with proton translocation across a membrane
(http://pfam.wustl.edu/cgi-bin/getdesc?name=oxidored_q1_N).

Reverse Transcriptase-Related Sequences 

[0211] Reverse transcriptases are enzymes that make double-stranded
DNA copies from single-stranded nucleic acid template molecules. Typically, a
reverse transcriptase is a DNA polymerase that can copy both RNA and DNA
templates, and has an integral RNase H activity (Lim et al., 2002). The two
enzymatic domains of reverse transcriptase reflect these two activities; the first is a 
DNA polymerase domain that can use either RNA or DNA as a template to synthesize
either the minus-strand or the plus strand of DNA, and the second is an RNase H 
domain that degrades the RNA in RNA-DNA hybrids (Coffin, 1997; Wu and Gallo, 
1975). 

[0212] Reverse transcriptase plays a role in the replication of some
viruses, e.g., retroviruses. It copies the retroviral RNA genome to produce a single 
minus strand of DNA, then catalyzes the synthesis of a complementary plus strand. 
Accordingly, reverse transcriptase is a therapeutic target for conditions that involve 
retroviruses, e.g., Acquired Immune Deficiency Syndrome (AIDS). A number of
anti-retroviral drugs inhibit reverse transcriptase (Frank, 2002).
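
The two synthesis steps described above, copying the RNA genome into a minus strand of DNA and then synthesizing the complementary plus strand, can be illustrated at the level of base pairing. This is a simplified sketch: the function names and the ten-base example sequence are hypothetical, and priming, strand transfers, and RNase H degradation are not modeled.

```python
# Watson-Crick pairing used when a template strand is copied into DNA.
RNA_TO_DNA = {"A": "T", "U": "A", "G": "C", "C": "G"}
DNA_TO_DNA = {"A": "T", "T": "A", "G": "C", "C": "G"}

def minus_strand(rna):
    """Return the minus-strand DNA, written 5'->3', for a 5'->3' RNA template."""
    return "".join(RNA_TO_DNA[base] for base in reversed(rna.upper()))

def plus_strand(minus):
    """Return the plus-strand DNA, written 5'->3', copied from the minus strand."""
    return "".join(DNA_TO_DNA[base] for base in reversed(minus.upper()))

genome = "AUGGCCUUAG"         # hypothetical RNA fragment, not a real viral sequence
minus = minus_strand(genome)  # first DNA strand, complementary to the RNA
plus = plus_strand(minus)     # second DNA strand, same sense as the RNA
print(minus, plus)
```

The plus strand carries the same sequence as the RNA template with thymine in place of uracil, which is why the double-stranded product can serve as a DNA copy of the RNA genome.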

[0213] Reverse transcriptase is also a standard scientific research tool in
the field of molecular biology. The reverse transcriptase polymerase chain reaction 
(RTPCR) amplifies specific DNA sequences rapidly, and in vitro. RTPCR can detect 
trace amounts of RNA and DNA, and is used in a wide range of applications,
including forensics, the diagnosis of genetic diseases, determination of the prognosis 
of diagnosed diseases, and the detection of viral infection (Alberts et al., 1994). For
example, reverse transcriptase is used to diagnose cancer (Rowland, 2002), and to 
provide prognostic information about the predicted survival of patients with prostate
cancer (Kantoff et al., 2001).

[0214] An example of a reverse transcriptase is telomerase, a general 
tumor marker with a reverse transcriptase catalytic subunit (Kirkpatrick and Mokbel, 
2001). Most human somatic cells do not express the telomerase reverse transcriptase 
gene; conversely, most cancer cells express this gene (Ducrest et al., 2002; Kyo et al., 
2000). The human telomerase reverse transcriptase promoter has been placed in gene
therapy vectors that specifically target telomerase-positive tumor cells, and spare 
nearby telomerase-negative cells (Pan and Koeneman, 1999). Human telomerase 
reverse transcriptase is also recognized as a tumor antigen that can be a target for 
immunotherapeutic approaches to cancer (Gordan and Vonderheide, 2002). 

[0215] Reverse transcriptase-related sequences can possess or interact
with rvt, transposase_22, WD40, and Exo_endo_phos domains, all of which are 
described above. 

Ribosome-Related Sequences

[0216] A ribosome is a particle comprised of ribosomal proteins and 
ribosomal RNA that catalyzes protein synthesis from messenger RNA. Ribosomes
are composed of two subunits, the large (L) subunit and the small (S) subunit. The 
typical mammalian ribosome comprises four RNA molecules and approximately 
eighty different proteins, which are highly conserved among prokaryotes and
eukaryotes, and perform a variety of tasks related to protein synthesis, e.g.,
coordinating protein synthesis in a manner that maintains cell homeostasis
(Yoshihama et al., 2002; Kenmochi et al., 1998).

[0217] Ribosomal proteins can perform functions independent of their
involvement in protein synthesis. For example, they are involved in cell-cycle 
progression, e.g., as cell cycle checkpoints, and mediators of homologous 
recombination, embryogenesis, and skeletal development (Yoshihama et al., 2002; 
Chen and Ioannou, 1999). They also contribute to the regulation of cell growth,
transformation, and death, and can induce apoptosis (Chen and Ioannou, 1999; Naora
et al., 1999). Mutations in ribosomal proteins are associated with human diseases,
including Down syndrome, Diamond-Blackfan anemia, Turner syndrome, and 
Noonan syndrome (Yoshihama et al., 2002). 

[0218] Ribosomal proteins have been grouped into protein families on the
basis of sequence similarities in functional domains. One family of ribosomal
proteins, the ribosomal protein L11, RNA binding (Ribosomal_L11) domain, is
comprised of members that possess the L11 RNA binding domain; this family
includes the ribosomal proteins L11 and L12, which are components of the large
subunit. L11 is a protein of 140 to 165 amino acids that binds to a 23S RNA
molecule, the C-terminal region of which is buried within the ribosomal structure
(http://pfam.wustl.edu/cgi-bin/getdesc?name=Ribosomal_L11). Another family of
large ribosomal subunit proteins possesses the ribosomal protein L13e
(Ribosomal_L13e) domain, which is found in a wide range of vertebrates and in
lower-order species (http://pfam.wustl.edu/cgi-bin/getdesc?name=Ribosomal_L13e),
as is the ribosomal protein L44 (Ribosomal_L44) domain
(http://pfam.wustl.edu/cgi-bin/getdesc?name=Ribosomal_L44).

[0219] Additional ribosomal protein families encompass small subunit
proteins. The ribosomal protein S6e (Ribosomal_S6e) domain is present in a family
of proteins which includes protein kinase substrates that control cell growth and
proliferation by selectively translating particular classes of mRNA
(http://pfam.wustl.edu/cgi-bin/getdesc?name=Ribosomal_S6e). The ribosomal
protein S8e (Ribosomal_S8e) domain is present in a family of proteins comprising
approximately 220 amino acids in eukaryotes, and about 125 amino acids in
archaebacteria (http://pfam.wustl.edu/cgi-bin/getdesc?name=Ribosomal_S8e). The
ribosomal protein S10p/S20e (Ribosomal_S10) domain is present in a family of
proteins which includes the small ribosomal subunit S10 from prokaryotes and S20
from eukaryotes (http://pfam.wustl.edu/cgi-bin/getdesc?name=Ribosomal_S10). S10
is involved in binding transfer RNA to the ribosome, and also operates as a
transcriptional elongation factor.

RNase-Related Sequences

[0220] RNases are enzymes that cleave RNA. RNases generally
recognize their targets by tertiary structure, rather than by sequence; they include
exonucleases, which remove the terminal base in an RNA sequence, and
endonucleases, which can cleave non-terminal bases. Examples of RNases include
RNase E, which is involved in the formation of 5S ribosomal RNA from
pre-ribosomal RNA; RNase F, which cleaves both viral and host RNA in response to
interferons, inhibiting protein synthesis; RNase H, which is specific for the RNA
strand of an RNA-DNA hybrid; RNase P, which generates transfer RNA from
precursor transcripts; and RNase T, which removes the terminal AMP from
nonaminoacylated tRNA (Coffin et al., 1997).

[0221] RNase-related sequences can possess or interact with rvt, rve,
RNase H, and gag_p30 domains, all of which are described above.

RNase H-Related Sequences 

[0222] RNase H is a nuclease specific for the RNA strand of an
RNA-DNA hybrid that cleaves phosphodiester bonds to produce molecules with 3'-OH and
5'-PO4 ends. Multiple forms of RNase H are present in both prokaryotes and
eukaryotes. RNase H may be part of larger polypeptides and its activity can be
influenced by other regions of these polypeptides (Coffin et al., 1997; Crouch, 1990).

[0223] During retroviral replication, RNase H activity forms
oligonucleotides that prime DNA synthesis. Therefore, the RNase H activity of 
reverse transcriptase is a target for therapeutic intervention. For example, small 
molecule inhibitors of retroviral RNase H function have shown promise in managing 
HIV infection (Klaiman et al., 2002).

[0224] Another therapeutic indication for RNase H is the regulation of
cancer genes by targeting mRNA translation. Antisense deoxyoligonucleotides
down-regulate mRNA expression by annealing to specific regions of an mRNA. Formation
of the DNA:RNA heteroduplex then triggers mRNA cleavage by RNase H. Cleavage
is rapidly followed by further degradation, irreversibly preventing translation of the
target mRNA. Antisense deoxyoligonucleotides that trigger RNase H activity can
thus be used as cancer therapeutic agents (Crooke, 1996; Curcio et al., 1997).
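
By way of illustration only, the annealing relationship described above can be sketched in Python: an antisense deoxyoligonucleotide for a chosen mRNA region is the reverse complement of that region, written in DNA bases. The function name, the example sequence, and the chosen coordinates are hypothetical and are not taken from the sequences of the invention; the sketch does not model oligonucleotide chemistry, target accessibility, or RNase H recruitment.

```python
# Hypothetical helper: derive an antisense DNA oligo for a region of an mRNA.
# Maps each RNA base to the DNA base that pairs with it.
RNA_TO_DNA_COMPLEMENT = {"A": "T", "U": "A", "G": "C", "C": "G"}


def antisense_oligo(mrna: str, start: int, length: int) -> str:
    """Return the DNA oligo (5'->3') that anneals to mrna[start:start+length]."""
    region = mrna.upper()[start:start + length]
    if len(region) != length:
        raise ValueError("target region extends past the end of the mRNA")
    # Complement each base, walking the region backwards so the oligo reads 5'->3'.
    return "".join(RNA_TO_DNA_COMPLEMENT[base] for base in reversed(region))


example_mrna = "AUGGCCAUUGUAAUGGGCCGC"  # made-up sequence for illustration
oligo = antisense_oligo(example_mrna, 0, 9)
print(oligo)
```

The resulting DNA:RNA heteroduplex is the substrate on which RNase H would act in the mechanism described in the preceding paragraph.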

[0225] RNase H-related sequences can possess or interact with rnaseH,
Gag_p30, rvt, and rve domains, all of which are described above. 

SH3-Related Sequences

[0226] Src homology region 3 (SH3) is a polypeptide domain commonly 
found in intracellular signaling proteins; it binds with moderate affinity and selectivity 
to proline-rich ligands. SH3 domains are heterogeneous; different SH3 domains bind 
to different proline-rich sequences (Gmeiner and Horita, 2001). SH3 domains are 
involved in a wide variety of biological processes, including mediating the assembly 
of large multiprotein complexes, regulating enzyme activity, and modulating the local 
concentration or subcellular localization of signaling pathway components (Mayer, 
2001). Examples of SH3-related sequences include phosphotyrosine receptors,
membrane associated guanylate kinases, mitogen-activated protein kinases, myosin 1,
the Crk adaptor protein, phospholipase C-γ, Grb2, Sos, src-SH3, Abl-SH3, the Nck
adaptor, and alpha-spectrin-SH3. 
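
As a non-limiting sketch, candidate proline-rich ligand sites of the kind bound by SH3 domains can be located by scanning a protein sequence for the minimal core motif P-x-x-P, where x is any residue. The use of this simplified motif, the function name, and the example sequence are assumptions made for illustration; actual SH3 ligand specificity depends on flanking residues and differs between SH3 domains, as noted above.

```python
import re

# Simplified proline-rich core motif: P, any two residues, P.
# The lookahead lets overlapping occurrences be reported.
PXXP = re.compile(r"(?=(P..P))")


def find_pxxp(protein: str) -> list[tuple[int, str]]:
    """Return (0-based start, matched residues) for every PxxP occurrence."""
    return [(m.start(), m.group(1)) for m in PXXP.finditer(protein.upper())]


# Made-up peptide containing two overlapping PxxP cores.
print(find_pxxp("AAPPLPPRAA"))
```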

[0227] SH3-related sequences can possess or interact with SH3
domains, which are protein domains of approximately 50-70 amino acids, and are
present in a large number of proteins involved in intracellular signaling
(http://pfam.wustl.edu/cgi-bin/getdesc?name=SH3). SH3-related sequences can also possess or
interact with SH3 domain-binding protein 5 (SH3BP5) domains, which are protein
domains that act as a substrate for c-Jun N-terminal kinase
(http://pfam.wustl.edu/cgi-bin/getdesc?name=SH3BP5).

Stem Cell-Related Sequences 

[0228] Stem cells are pluripotent or multipotent cells that generate maturing 
cells in multiple differentiation lineages. Pluripotent cells have the capacity to 
differentiate into each and every cell present in the organism. Embryonic stem cells 
are pluripotent; they can differentiate into any of the cells present in the adult.
Multipotent cells have the ability to differentiate into more than one cell type.
Organ-specific stem cells are multipotent; they can differentiate into any of the cells of the
organ they inhabit. 

[0229] When they divide in vivo, both pluripotent and multipotent stem cells 
can maintain their pluripotency or multipotency while giving rise to differentiated 
progeny. Thus, stem cells can produce replicas of themselves which are pluri- or
multipotent, and are also able to differentiate into lineage-restricted committed
progenitor cells. For example, hematopoietic stem cells, which are multipotent cells
specifically able to form blood cells, can divide to produce replicate hematopoietic
stem cells. They can also divide to produce more highly differentiated cells, which
are precursors of blood cells. The precursors differentiate, sometimes through several
generations of cells, into blood cells. A hematopoietic stem cell can also divide into a
cell with the capacity to form, for example, a relatively undifferentiated cell that is
committed to differentiate into, e.g., granulocytes, or erythrocytes, or another type of
blood cell.

[0230] Stem cells can also reproduce and differentiate in vitro. Embryonic 
stem cells have been directed to differentiate into cardiac muscle cells in vitro and, 
alternatively, into early progenitors of neural stem cells, and then into mature neurons 
and glial cells in vitro (Trounson, 2002). 

[0231] Stem cell therapy is effective in treating cancer in humans (Slavin et
al., 2001), and offers several advantages over traditional cancer therapies (Weissman,
2000). One advantage of stem cell therapy exists when used in conjunction with 
radiation therapy. In radiation therapy for cancer, the dose of radiation necessary to 
kill the cancer cells in an organ can also be sufficient to destroy the healthy cells of 
the organ. In combined stem cell and radiation therapy, an organ is first treated with 
sufficient radiation to destroy all of the cancer cells and most or all of the healthy 
cells, but then stem cells are infused to repopulate the organ. In the ensuing weeks, as 
the cancer cells and healthy cells die, the stem cells replace the healthy cells. Another 
advantage of this approach, compared to heterologous organ transplants, is that there 
is no risk of rejection, since stem cells do not provoke an immune response. A further 
advantage is that stem cells are inherently programmed to regulate their numbers and 
differentiation status, i.e., once provided to the patient, the necessary number will 
differentiate, and the rest will remain undifferentiated (Weissman, 2000).

[0232] Stem cell therapy is also effective in treating autoimmune disease in
humans. For example, immunosuppression in conjunction with stem-cell
transplantation has induced remission in patients with refractory, severe rheumatic
autoimmune disease (Van Laar and Tyndall, 2003). Patients with rheumatoid
arthritis, systemic lupus erythematosus, systemic sclerosis, and juvenile idiopathic
arthritis have benefited from stem cell transplants (Van Laar and Tyndall, 2003).

[0233] Preclinical studies also suggest the potential of stem cell 
transplantation for the treatment of neural and muscular injuries and disorders, 
including those of the central nervous system, peripheral nervous system, and skeletal, 
cardiac and smooth muscle (Deasy and Huard, 2002). Stem cells transplanted into the 
bone marrow of mice migrate to the site of injured muscle and differentiate into new 
muscle cells. For example, patients with myasthenia gravis, muscular dystrophies, 
amyotrophic lateral sclerosis, congestive heart failure, Parkinson's disease, and
Alzheimer's disease may benefit from stem cell therapy (Henningson, 2003).

[0234] In addition to therapeutic uses, research using stem cells can provide 
useful information about normal stem cell function and the pathogenesis of disease.
Stem cells derived from a patient with a genetic disease can provide a tool for 
studying that disease. To derive these stem cells, a somatic cell, i.e., a cell that is not 
in the oocyte or spermatocyte lineage, is donated by the patient, and the nucleus is 
removed and transferred to an unfertilized human oocyte. This nuclear transplant
procedure produces, at the blastocyst stage of development, embryonic stem cells 
with the same set of genes as the patient with the genetic disease. Studying these 
cells, and their progeny in vitro, permits analysis of a specific model of the disease. 
For example, placing stem cells derived from a patient with a genetic disorder under 
the control of various stem cell regulatory factors can elicit abnormal responses from 
the affected stem cells compared to stem cells derived from a healthy individual's 
somatic nucleus. 

[0235] Embryonic stem cell-related sequences can possess or interact with 
the stem cell factor (SCF) domain, a transmembrane domain having a soluble, 
secreted form, which is involved in hematopoiesis, and which binds to and activates a
receptor tyrosine kinase, stimulating the proliferation of mast cells and augmenting
the proliferation of myeloid and lymphoid hematopoietic progenitors in bone marrow
culture (http://pfam.wustl.edu/cgi-bin/getdesc?name=SCF).

[0236] Certain stem cell related sequences can possess the ability to maintain
the stem cell in an undifferentiated state while allowing cell proliferation. Such
compositions can be useful in ex vivo cell therapy to expand populations of cells for
cell replacement therapy.

[0237] Certain stem cell related sequences can possess the ability to cause 
cell differentiation to a relatively mature cell type and are useful in in vivo or ex vivo
therapy to compensate for a deficiency of such a relatively mature cell type.

Synthetase-Related Sequences 

[0238] A synthetase is an enzyme that catalyzes the synthesis of a
molecule. Synthetases comprise a broad class of enzymes; they catalyze the synthesis
of nucleic acids, peptides, and lipids (Agou et al., 1996). Examples of synthetases
include lysyl-tRNA synthetase, asparaginyl-tRNA synthetase, holocarboxylase
synthetase, carbamyl phosphate synthetase I, and argininosuccinate synthetase.

[0239] Synthetase-related sequences can possess or interact with transfer 
RNA synthetase domains, which are protein domains that activate amino acids and 
transfer them to specific transfer RNA molecules as a step in protein biosynthesis 
(http://pfam.wustl.edu/cgi-bin/getdesc?name=tRNA-synt_2). The 20
aminoacyl-tRNA synthetases are divided into class I and class II, each of which contains multiple
synthetases with different specificities. For example, there is a protein domain
involved in asparagine, aspartic acid, and lysine synthesis
(http://pfam.wustl.edu/cgi-bin/textsearch?terms=trna-synt&search_what=all&sections=DE&sections=CC&size=100).
Synthetase-related sequences can also possess or
interact with lipid-A-disaccharide synthetase (LpxB) domains, which are protein
domains that catalyze the synthesis of disaccharides
(http://pfam.wustl.edu/cgi-bin/getdesc?name=LpxB).

TATA Box-Related Sequences 

[0240] A TATA box is a consensus sequence in the promoter region of
many eucaryotic genes that binds a general transcription factor and plays a role in
specifying the position for transcription initiation. TATA boxes are generally found 
approximately 25 nucleotides before the site of transcription initiation (Chalut et al., 
1995). Examples of TATA box-related sequences include TATA box binding 
protein, 13 TATA/TBP, and small nuclear RNA-activating protein 190 Myb DNA. 
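
Purely as an illustrative sketch, the positional relationship described above can be checked by scanning a promoter sequence for a TATA-like element located a specified distance upstream of a known transcription start site. The consensus TATA[AT]A[AT] used here, the distance window of 20 to 35 nucleotides, the function name, and the example sequence are assumptions introduced for illustration and are not taken from the specification.

```python
import re

# Assumed consensus for a TATA-like element; the lookahead reports overlapping hits.
TATA_LIKE = re.compile(r"(?=TATA[AT]A[AT])")


def tata_boxes_upstream(dna: str, tss: int,
                        min_dist: int = 20, max_dist: int = 35) -> list[int]:
    """Return the distance (nt) from each TATA-like match start to index `tss`.

    Only matches whose distance upstream of the transcription start site
    falls within [min_dist, max_dist] are reported.
    """
    hits = []
    for match in TATA_LIKE.finditer(dna.upper()):
        distance = tss - match.start()
        if min_dist <= distance <= max_dist:
            hits.append(distance)
    return hits


# Made-up promoter: a TATA-like element starting at index 10.
promoter = "G" * 10 + "TATAAAA" + "C" * 40
print(tata_boxes_upstream(promoter, tss=40))
```

Widening or narrowing the window is a design choice; the specification states only that TATA boxes are generally found approximately 25 nucleotides before the site of transcription initiation.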

[0241] TATA box-related sequences can possess or interact with
transcription factor TFIID, also known as the TATA-binding protein (TBP) domain,
which is a protein domain that specifically binds to the TATA box promoter element
(http://pfam.wustl.edu/cgi-bin/getdesc?name=TBP). TATA box-related sequences
can also possess or interact with HMG14 and HMG17 (HMG14_17) domains, which
are members of a family of high mobility group proteins, described above
(http://pfam.wustl.edu/cgi-bin/getdesc?name=HMG14_17).

Tat-Related Sequences

[0242] Tat is a human immunodeficiency virus (HIV) protein involved
in viral production of new RNA genomes and new complete viral particles. Tat is
also involved in AIDS pathogenesis; it plays a role in reactivating latent viruses, e.g.,
the JC retrovirus; it is involved in the development of AIDS-related Kaposi's
sarcoma; and it depresses the function of, and induces apoptosis in, helper CD4 cells
(Yu et al., 1995). Examples of Tat-related sequences include Tat-associated proteins, 
e.g., Tap, HIV-1 Rev, and tat-associated kinase (also known as positive transcriptional 
elongation factor b). 

[0243] Tat-related sequences can possess or interact with
transactivating regulatory protein (Tat) domains, which are protein domains that
contribute to efficient transcription of a viral genome
(http://pfam.wustl.edu/cgi-bin/getdesc?name=Tat). Tat-related sequences can also possess or interact with
mitochondrial glycoprotein (MAM33) domains, which are protein domains found in
mitochondrial matrix proteins, and which can be involved in mitochondrial oxidative
phosphorylation and in interactions between the nucleus and the mitochondria
(http://pfam.wustl.edu/cgi-bin/getdesc?name=MAM33).

Transferase-Related Sequences 

[0244] Transferases are enzymes that transfer a designated group of
atoms from a donor molecule to an acceptor molecule. For example, acyl transferases
transfer acyl groups, methyl transferases transfer methyl groups, nucleotidyl 
transferases transfer nucleotides, prenyltransferases transfer prenyl groups, and 
glycosyl transferases transfer glycosyl groups (Lin et al., 1996). Examples of 
transferases include acetyltransferases, hydroxymethyltransferases, sialyltransferases, 
arginine N-methyltransferase, glucuronosyltransferase, NTP-transferase, and
GDP-mannose pyrophosphorylase B.

[0245] Transferase-related sequences possess or interact with UDP- 
glucuronosyl and UDP-glucosyl transferase domains, which are protein domains 
found in a superfamily of enzymes that catalyze the addition of the glycosyl group
from a UTP-sugar to a small hydrophobic molecule
(http://pfam.wustl.edu/cgi-bin/getdesc?name=UDPGT). Transferase-related sequences also possess or interact
with nucleotide transferase (NTP_transferase) domains, which are protein domains
that transfer nucleotides onto phosphorylated sugars
(http://pfam.wustl.edu/cgi-bin/getdesc?name=NTP_transferase).

Transposase-Related Sequences 

[0246] Transposases are site-specific recombination enzymes that
catalyze the transposition of a segment of DNA from one part of the genome to
another. The movable segments are called transposable elements; each transposable
element is occasionally moved by a transposase, which functions as an integrase, by
inserting DNA sequences into other DNA sequences. Transposases are often encoded
by the DNA of the transposable element itself. Transposases bind specifically to
terminal inverted repeats of 10-500 bp that are characteristically part of transposable
elements (Smit and Riggs, 1996). They catalyze both cutting and pasting of a
transposable element from one segment of the genome to another. Sequences related
to transposases can have other functions, e.g., as transcription factors, or in the
assembly of centromere proteins (Smit and Riggs, 1996). Examples of
transposase-related sequences include mariner, pogo, hobo, tigger, MER37, Galileo, Ocean,
Impala, Tn MERIl, MsqTc3, and the sleeping beauty transposon system (Robertson
and Zumpano, 1997; Robertson, 1996; Smit and Riggs, 1996).
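
As a minimal, non-limiting sketch, the terminal inverted repeats mentioned above can be recognized computationally: the sequence at one end of an element matches the reverse complement of the sequence at the other end. The function names and the example element are hypothetical, and the sketch accepts only perfect, ungapped repeats, whereas naturally occurring inverted repeats can contain mismatches.

```python
# Translation table mapping each DNA base to its complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def terminal_inverted_repeat_length(element: str) -> int:
    """Length of the longest perfect terminal inverted repeat of `element`.

    The left end is compared, base by base, with the reverse complement of
    the right end; comparison stops at the first mismatch or at the midpoint.
    """
    element = element.upper()
    rc = reverse_complement(element)
    length = 0
    for left_base, rc_base in zip(element[: len(element) // 2], rc):
        if left_base != rc_base:
            break
        length += 1
    return length


# Made-up element: an 8-bp left end, an unrelated core, and the
# reverse complement of the left end at the right terminus.
example_element = "CAGTGGTA" + "AAAGGGAAA" + "TACCACTG"
print(terminal_inverted_repeat_length(example_element))
```

A caller could compare the returned length with the 10-500 bp range stated above when screening candidate elements.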

[0247] Transposase-related sequences can possess or interact with a
transposase 1 (Transposase_1) domain, which is characterized by sequences that can
excise and/or insert mobile genetic elements such as transposons or insertion
sequences; for example, mariner possesses a transposase 1 domain
(http://pfam.wustl.edu/cgi-bin/getdesc?name=Transposase_1). Transposase-related
sequences can also possess or interact with L1 transposable element (Transposase_22)
domains, which have been described above. Transposase-related sequences can also
possess or interact with a DDE endonuclease (DDE) domain, which is responsible for
coordinating metal ions needed for endonuclease catalytic activity
(http://pfam.wustl.edu/cgi-bin/getdesc?name=DDE). Transposase-related sequences can additionally
possess or interact with a zinc finger, C2H2 type (zf-C2H2) domain, which binds
nucleic acids using a mechanism that involves coordinating a zinc atom with a pair of
cysteine residues and a pair of histidine residues
(http://pfam.wustl.edu/cgi-bin/getdesc?name=zf-C2H2). Transposase-related sequences can also possess or
interact with a reverse transcriptase (rvt) domain, and/or a low-density lipoprotein
receptor (Idl rece) domain, both of which are described above.

Ubiquitin-Related Sequences

[0248] Ubiquitin is a protein found in all eucaryotic cells examined to 
date. When it is linked to the lysine side chain of a protein by the formation of an 
amide bond with its C-terminal glycine, ubiquitin renders the ubiquitin-bound protein 
subject to rapid proteolysis in the proteasome. In addition to its role in the selective 
degradation of cellular proteins, ubiquitin also plays a role in maintaining 
chromosome structure, regulating gene expression, responding to stresses on the
organism, and ribosome biogenesis. Examples of
ubiquitin-related sequences include elongins, ubiquitin-specific proteases,
ubiquitin-calmodulin ligase, ubiquitin carrier protein kinase, ubiquitin N-alpha-protein
hydrolase, and the small ubiquitin-related modifier (Sumo-1) (Kamitani et al., 1997). 

[0249] Ubiquitin-related sequences can possess or interact with a
ubiquitin domain, which is a conserved sequence of approximately 76 amino acid
residues that comprise the protein ubiquitin
(http://pfam.wustl.edu/cgi-bin/getdesc?name=ubiquitin). Ubiquitin-related sequences can also possess or
interact with a ubiquitin carboxyl-terminal hydrolase (UCH) domain, which is a protein
domain that comprises a thiol protease that recognizes and hydrolyses the peptide
bond at the C-terminal glycine of ubiquitin
(http://pfam.wustl.edu/cgi-bin/getdesc?name=UCH).

Virus-Related Sequences 

[0250] The human chromosome has integrated endogenous genes that 
are related to viral genes. Some endogenous viral genes, e.g., the retroviral HERV-W 
family, are widely and heterogeneously dispersed among human chromosomes 
(Voisset et al., 2000; Everett et al., 1997; Werner et al., 1990). Endogenous 
proviruses are usually transcriptionally silent, but are expressed under certain 
conditions (Coffin et al., 1997). Endogenous viral expression can be specific to host 
factors, such as cell type or stage of differentiation, as well as other factors including 
the position on the chromosome, the influence of cis-acting sequences, or the presence 
of host-mediated DNA methylation (Coffin et al., 1997).

[0251] Endogenous viral expression can have a number of
consequences, both beneficial and detrimental. Among the beneficial consequences is
the ability of endogenous retroviruses to confer resistance to infection by exogenous
viruses. For example, mice with endogenous mouse mammary tumor virus (MMTV)
can be immune to exogenous infection (Golovkina et al., 1992). Among the
detrimental effects is a causative role in disease. Evidence indicates an association
of endogenous viruses with cancers and autoimmune diseases (Coffin et al.,
1997). For example, spontaneous tumors of specific origin, murine mammary
adenocarcinomas, and murine T-cell lymphomas have been associated with the
presence of specific endogenous retroviruses. Furthermore, a transformed phenotype 
is associated with the increased transcription of certain classes of endogenous viral 
elements (Coffin et al., 1997). With respect to autoimmune disease, an endogenous 
virus that influences the immunoregulatory process has been associated with 
spontaneous autoimmune thyroiditis in a chicken model of human Hashimoto disease
(Wick et al., 1987). Examples of viral-related proteins include hepatitis B virus x- 
interacting protein, herpesvirus associated ubiquitin-specific protease, and 
Coxsackievirus and adenovirus receptor precursor. 

[0252] Viral-related sequences can possess or interact with rvt, rve, and 
gag_p30 sequences, all of which are described above.

Zinc Finger-Related Sequences 

[0253] A zinc finger domain is a small, self-folding, structural motif of 25 to 
30 amino-acid residues present in many nucleic acid-binding proteins. It is comprised 
of a polypeptide loop held in a hairpin bend and bound to a zinc atom, and includes 
two conserved cysteine and two conserved histidine residues. Many classes of zinc 
fingers have been characterized according to the number and positions of the 
conserved histidine and cysteine residues. The amino acid configuration that holds
the zinc atom in a tetrahedral array has a finger-like projection that interacts with 
nucleotides in the major groove of the bound nucleic acid. Zinc finger motifs have 
conserved regions near the zinc atom, and variable regions at the nucleic acid
binding site that provide specificity for the nucleic acid sequences they bind. Zinc 
finger proteins have a variety of functions, including as transcription regulators and
intracellular receptors. Zinc finger domains are also involved in protein-protein 
interactions, e.g., those involving protein kinase C. Recently, zinc finger nucleases 
have been used to target genes for gene replacement by homologous recombination 
(Bibikova et al., 2003). Examples of zinc finger proteins include XC3H-3b, the 
transcription factor Slug, and transcription factor IIIA.

[0254] Zinc finger-related sequences can possess or interact with a zinc
finger C2H2 type (zf-C2H2) domain, which binds a zinc atom with two cysteine and
two histidine residues, and is utilized, e.g., in RNA transcription
(http://pfam.wustl.edu/cgi-bin/getdesc?name=zf-C2H2). Zinc finger-related sequences can also possess
or interact with a C3HC4 type, RING finger (zf-C3HC4) domain, which is a
specialized type of zinc finger domain comprised of 40 to 60 amino acids that binds
two zinc atoms; variants of RING-finger domains include the C3HC4-type and the
C3H2C3-type (http://pfam.wustl.edu/cgi-bin/getdesc?name=zf-C3HC4). Proteins
with RING-finger domains have developmental and functional roles; they are
involved in intracellular receptor binding, and in mediating protein-protein
interactions (Gray et al., 2000). RING-finger domains can exhibit ubiquitin-protein
ligase activity, and can bind to E2 ubiquitin-conjugating enzymes.
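
For illustration only, candidate C2H2-type zinc fingers, in which two cysteine and two histidine residues coordinate a zinc atom as described above, can be located with a spacing pattern. The simplified pattern C-x(2,4)-C-x(12)-H-x(3,5)-H, the function name, and the synthetic test sequence are assumptions made for this sketch; a curated profile or hidden Markov model would ordinarily be preferred for actual domain assignment.

```python
import re

# Simplified C2H2 spacing pattern (assumed for illustration):
# Cys, 2-4 residues, Cys, 12 residues, His, 3-5 residues, His.
C2H2_PATTERN = re.compile(r"C.{2,4}C.{12}H.{3,5}H")


def find_c2h2_candidates(protein: str) -> list[tuple[int, int]]:
    """Return (start, end) index pairs of non-overlapping C2H2-like matches."""
    return [match.span() for match in C2H2_PATTERN.finditer(protein.upper())]


# Synthetic sequence built to satisfy the spacing; alanine stands in for "x".
synthetic = "MK" + "C" + "AA" + "C" + "A" * 12 + "H" + "AAA" + "H" + "GG"
print(find_c2h2_candidates(synthetic))
```

Because the pattern constrains only the spacing of the conserved residues, it reports candidates rather than confirmed zinc fingers.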

[0255] Zinc finger-related sequences can also possess or interact with a zinc
knuckle (zf-CCHC) domain, which is an 18-amino acid zinc finger domain found in
RNA-binding and single strand DNA-binding proteins; they are often involved in
eukaryotic gene regulation (http://pfam.wustl.edu/cgi-bin/getdesc?name=zf-CCHC).
Zinc knuckles are also found in retroviral gag and nucleocapsid proteins, where they
function in genome packaging, and early in the infection process. Zinc finger-related
sequences can also possess or interact with a BTB/POZ (BTB) domain, which
mediates both homomeric and heteromeric protein dimerization
(http://pfam.wustl.edu/cgi-bin/getdesc?name=BTB). Zinc finger-related sequences can also possess or
interact with NF-X1 type zinc finger (zf-NF-X1) domains, which are found in the
transcriptional repressor NF-X1, where they repress transcription of HLA-DRA, and
in the shuttle craft protein, which plays a role in late stage embryonic neurogenesis
(http://pfam.wustl.edu/cgi-bin/getdesc?name=zf-NF-X1). Zinc finger-related
sequences can also possess or interact with a KRAB box (KRAB) domain, also
known as a Kruppel-associated box, which is comprised of approximately 75 amino
acids, enriched in charged amino acids, and involved in protein-protein interactions
(http://pfam.wustl.edu/cgi-bin/getdesc?name=KRAB). KRAB domains can function
as transcription factors, e.g., as a transcriptional repressor, and can assume roles in
cell differentiation and development (Aubry et al., 1992; Lovering and Trowsdale,
1991). Zinc finger-related sequences can possess or interact with a transposase_22
domain, which is described above.

Industrial Applicability
[0256] The invention provides sequences related to secreted sequences, 
single-transmembrane sequences, multiple-transmembrane sequences, kinase-related 
sequences, ligase-related sequences, nuclear hormone receptor-related sequences, 
phosphatase-related sequences, protease-related sequences, phosphodiesterase-related 
sequences, kinesin-related sequences, immunoglobulin-related sequences, T-cell
receptor-related sequences, glycosylphosphatidylinositol anchor-related sequences, 
and sequences related to other nucleic acid and amino acid sequences of the invention, 
including activators, adaptors, adhesion molecules, ATPases, ATP, breakpoints, 
channels, checkpoints, complexes, dehydrogenases, disintegrins, endopeptidases, 
germ cells, GTPases, helicases, hydrolases, integrases, integrins, isomerases,
membranes, mucins, oxygenases, peroxidases, phospholipases, prosaposins,
proteasomes, reductases, reverse transcriptases, RNases, RNases H, SH3, synthetases,
TATA boxes, Tat proteins, transferases, transposases, ubiquitins, and viruses. The
invention provides for novel polynucleotides, related novel polypeptides and active 
fragments thereof, as well as novel nucleic acid compositions encoding these 
polypeptides, compositions comprising the related polypeptides, and methods for their 
use. 

[0257] The present invention also provides for vectors, host cells, and 
methods for producing the polynucleotides and polypeptides of the invention in these 
vectors and host cells. The present invention further provides for antisense molecules 
that are capable of regulating the expression of the polynucleotides or polypeptides 
herein. In addition, modulators, including antibodies that bind specifically to the 
polypeptides or modulate the activity of the polypeptides, are also provided. 

[0258] The present polynucleotides, polypeptides, and modulators find 
use in therapeutic agent screening/discovery applications, such as screening for 
receptors or competitive ligands, for use, for example, as small molecule therapeutic 
drugs. Also provided are methods of modulating a biological activity of a polypeptide 
and methods of treating associated disease conditions, particularly by administering 
modulators of the present polypeptides, such as small molecule modulators, antisense 
molecules, and specific antibodies. 

[0259] The present polypeptides, polynucleotides, and modulators find 
use in a number of diagnostic, prophylactic, and therapeutic applications. The 
polynucleotides and polypeptides of the invention can be detected by methods
provided herein; these methods are useful in diagnosis, and can be accomplished by
the use of diagnostic kits. The polynucleotides and polypeptides of the invention are 
useful for treating a variety of disorders, including cancer, proliferative disorders, 
inflammatory disorders, immune disorders, viral disorders, bacterial disorders, and 
metabolic disorders. For example, subjects who suffer from a deficiency, or a lack of
a particular protein, or are otherwise in need of such protein to repair or enhance a 
desirable function, benefit from the administration of a protein or an active fragment 
thereof by any conventional routes of administration. These include therapeutic
vaccines in the form of nucleic acid or polypeptide vaccines, such as cancer vaccines, 
where the vaccines can be administered alone, such as naked DNA, or can be 
facilitated, such as via viral vectors, microsomes, or liposomes. Therapeutic
antibodies include those that are administered alone or in combination with cytotoxic
agents, such as radioactive or chemotherapeutic agents. 

[0260] In particular, the polypeptides, polynucleotides, and modulators 
of the present invention can be used to treat cancers, including, but not limited to, 
cancers of the prostate, breast, bone, soft tissue, liver, kidney, ovary, cervix, skin, 
pancreas, and brain, as well as leukemias, lymphomas, lung cancers such as 
adenocarcinomas and squamous cell carcinoma, and cancers of gastrointestinal organs 
such as stomach, colon, and rectum. Further, the polypeptides, polynucleotides, and 
modulators of the present invention can be used to treat inflammatory, immune, viral, 
bacterial, and metabolic diseases, disorders, syndromes, or conditions, including, but
not limited to, intestinal inflammation and immunity, autoimmune thyroiditis, and
retroviral infections, as well as tissue and/or organ hypertrophy. 

Disclosure of the Invention 
[0261] The present invention features an isolated polynucleotide that 
encodes a polypeptide. In some embodiments, the polypeptide has at least about 70%, 
at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least 
about 95%, at least about 97%, at least about 98%, or at least about 99% amino acid 
sequence identity with an amino acid sequence derived from a polynucleotide 
sequence chosen from at least one nucleotide sequence according to SEQ ID NOS.: 1- 
104. In some embodiments, the polypeptide has an amino acid sequence chosen from 
at least one amino acid sequence encoded by SEQ ID NOS.: 1-104. In many 
embodiments, the polypeptide has at least one activity associated with the naturally 
occurring encoded polypeptide. 
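
By way of non-limiting illustration, an amino acid sequence can be derived from a polynucleotide sequence by conceptual translation. The following Python sketch assumes the standard genetic code, a DNA coding strand read in the first reading frame, and, by default, translation that ends at the first stop codon. The function name `translate` and the example sequence are hypothetical and are not taken from the Sequence Listing.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
_BASES = "TCAG"
_AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): residue
    for codon, residue in zip(product(_BASES, repeat=3), _AMINO_ACIDS)
}


def translate(dna: str, to_stop: bool = True) -> str:
    """Translate a coding-strand sequence in frame 1.

    Unrecognized codons (e.g., containing N) are reported as 'X'.
    Trailing bases that do not form a full codon are ignored.
    """
    dna = dna.upper().replace("U", "T")
    protein = []
    for i in range(0, len(dna) - 2, 3):
        residue = CODON_TABLE.get(dna[i:i + 3], "X")
        if residue == "*" and to_stop:
            break  # stop codon ends the derived polypeptide
        protein.append(residue)
    return "".join(protein)


if __name__ == "__main__":
    # Hypothetical example sequence, for illustration only.
    print(translate("ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG"))
```

Other reading frames, alternative genetic codes, and signal peptide cleavage are outside the scope of this sketch.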

[0262] In some embodiments, the polypeptide includes a signal peptide. In 
alternative embodiments, the polypeptide comprises a mature form of a protein, from 
which the signal peptide has been cleaved. In other embodiments, the polypeptide is a 
signal peptide. In a further aspect, the invention provides fragments of a polypeptide 
chosen from at least one amino acid sequence encoded by SEQ ID NOS.: 1-104, 
where each fragment is an extracellular fragment of the polypeptide, or an 
extracellular fragment of the polypeptide minus the signal peptide. The invention 
provides an N-terminal fragment containing a Pfam domain, and a C-terminal 
fragment containing a Pfam domain, and either or both may be biologically active. 

[0263] In yet other embodiments, the polypeptides function as secreted 
proteins. In yet further embodiments, the polypeptides function as single- 
transmembrane proteins. In yet further embodiments, the polypeptides function as 
multiple-transmembrane proteins. In yet further embodiments, the polypeptides 
function as kinases. In yet further embodiments, the polypeptides function as protein 
kinases. In yet further embodiments, the polypeptides function as ligases. In yet 
further embodiments, the polypeptides function as nuclear hormone receptors. In yet 
further embodiments, the polypeptides function as phosphatases. In yet further 
embodiments, the polypeptides function as proteases. In yet further embodiments, the 
polypeptides function as phosphodiesterases. In yet further embodiments, the 
polypeptides function as kinesins. In yet further embodiments, the polypeptides 
function as immunoglobulins. In yet further embodiments, the polypeptides function 
as T-cell receptors. In yet further embodiments, the polypeptides function as 
glycosylphosphatidylinositol anchors. 

[0264] In yet further embodiments, the polypeptides function as cytokines. 
In still further embodiments, the polypeptides function as immune cells. In further 
embodiments, the polypeptides function as antigens. In yet further embodiments, the 
polypeptides function as receptors. In other embodiments, the polypeptides function 
as binding proteins. In other embodiments, the polypeptides function as factors. In 
further embodiments, the polypeptides function as growth factors. In further 
embodiments, the polypeptides function as heat-shock proteins. In some 
embodiments, the polypeptides function as membrane transport proteins. In yet 
further embodiments, the polypeptides function as ribosomal proteins. In some 
embodiments, the polypeptides function as zinc fingers. In some embodiments, the 
polypeptides function as embryonic stem cell-related peptides. In still further 
embodiments, the polypeptides function in pathological states. In other embodiments, 
the polypeptides function as one or more of these. 

[0265] In yet further embodiments, the polypeptides function as activators. 
In yet further embodiments, the polypeptides function as adaptors. In yet further 
embodiments, the polypeptides function as adhesion molecules. In yet further 
embodiments, the polypeptides function as ATPases. In yet further embodiments, the 
polypeptides function as ATP-related polypeptides. In further embodiments, the 
polypeptides function as channel-related polypeptides. In yet further embodiments, 
the polypeptides function as checkpoint-related polypeptides. In yet further 
embodiments, the polypeptides function as complexes. In yet further embodiments, 
the polypeptides function as dehydrogenases. In yet further embodiments, the 
polypeptides function as disintegrins. In yet further embodiments, the polypeptides 
function as endopeptidases. In yet further embodiments, the polypeptides function as 
germ-cells. In yet further embodiments, the polypeptides function as GTPases. In yet 
further embodiments, the polypeptides function as helicases. In yet further 
embodiments, the polypeptides function as hydrolases. In yet further embodiments, 
the polypeptides function as integrases. In yet further embodiments, the polypeptides 
function as integrins. In yet further embodiments, the polypeptides function as 
isomerases. In yet further embodiments, the polypeptides function as membranes. In 
yet further embodiments, the polypeptides function as mucins. In yet further 
embodiments, the polypeptides function as oxygenases. In yet further embodiments, 
the polypeptides function as peroxidases. In some embodiments, the polypeptides 
function as phospholipases. In yet further embodiments, the polypeptides function as 
prosaposins. In yet further embodiments, the polypeptides function as proteasomes. 
In yet further embodiments, the polypeptides function as reductases. In other 
embodiments, the polypeptides function as reverse transcriptase-related polypeptides. 
In yet further embodiments, the polypeptides function as RNases. In further 
embodiments, the polypeptides function as RNase H-related polypeptides. In yet 
further embodiments, the polypeptides function as SH3-related polypeptides. In yet 
further embodiments, the polypeptides function as synthetases. In yet further 
embodiments, the polypeptides function as TATA box-related polypeptides. In yet 
further embodiments, the polypeptides function as TAT-related polypeptides. In yet 
further embodiments, the polypeptides function as transferases. In yet further 
embodiments, the polypeptides function as transposases. In yet further embodiments, 
the polypeptides function as ubiquitin-related polypeptides. In yet further 
embodiments, the polypeptides function as virus-related polypeptides. In other 
embodiments, the polypeptides function as one or more of these. 

[0266] The present invention features an isolated polynucleotide that 
hybridizes under stringent hybridization conditions to a coding region of at least one 
nucleotide sequence shown in SEQ ID NOS.: 1 - 104, or a complement thereof. 

[0267] The present invention features an isolated polynucleotide that shares 
at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least 
about 90%, at least about 95%, at least about 97%, at least about 98%, or at least about 
99% nucleotide sequence identity with a nucleotide sequence of the coding region of 
at least one sequence shown in SEQ ID NOS.: 1 - 104, or a complement thereof. In 
some embodiments, a subject polynucleotide has the nucleotide sequence shown in at 
least one of SEQ ID NOS.: 1 - 104, or a coding region thereof. 
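
For illustration only, the percent sequence identity thresholds recited above can be checked computationally once two sequences have been aligned. The following Python sketch assumes that the two sequences are already aligned to equal length with '-' marking gaps, and it counts identity over all aligned columns, including gap columns. This is one of several conventions in use and is an assumption of the sketch, not a definition given in this specification. The function names are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the columns of a pairwise alignment.

    Both inputs must already be aligned (equal length, '-' for gaps).
    Columns in which both sequences have a gap are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 70.0) -> bool:
    """True if the aligned pair shares at least `threshold` percent identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    # Hypothetical aligned sequences differing at one of ten positions.
    print(percent_identity("ACGTACGTAC", "ACGTACGTAA"))
```

The same calculation applies to aligned amino acid sequences. Producing the alignment itself requires an alignment program and is not shown.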

[0268] The present invention also features a vector, e.g., a recombinant 
vector, that includes a subject polynucleotide, and a promoter that drives its 
expression. This vector can transform a host cell, and the present invention further 
features such host cells, e.g., isolated in vitro host cells, and in vivo host cells, that 
comprise a polynucleotide of the invention, or a recombinant vector of the invention. 

[0269] The present invention further features a library of polynucleotides, 
wherein at least one of the polynucleotides comprises the sequence information of a 
polynucleotide of the invention. In specific embodiments, the library is provided on a 
nucleic acid array. In some embodiments, the library is provided in computer- 
readable format. 
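
As a hedged example of one possible computer-readable format, a polynucleotide library can be stored as FASTA-style text, in which each record has a '>' header line followed by sequence lines. This specification does not prescribe a format; the choice of FASTA, the function names, and the record identifiers below are assumptions made for illustration.

```python
def to_fasta(records, width=60):
    """Serialize (identifier, sequence) pairs as FASTA text."""
    lines = []
    for identifier, sequence in records:
        lines.append(f">{identifier}")
        # Wrap the sequence at a fixed line width.
        for i in range(0, len(sequence), width):
            lines.append(sequence[i:i + width])
    return "\n".join(lines) + "\n"


def from_fasta(text):
    """Parse FASTA text back into a list of (identifier, sequence) pairs."""
    records = []
    identifier, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if identifier is not None:
                records.append((identifier, "".join(chunks)))
            identifier, chunks = line[1:], []
        else:
            chunks.append(line)
    if identifier is not None:
        records.append((identifier, "".join(chunks)))
    return records


if __name__ == "__main__":
    # Hypothetical identifiers and sequences, not taken from the Sequence Listing.
    library = [("example_1", "ACGT" * 20), ("example_2", "GGCC")]
    print(to_fasta(library), end="")
```

A round trip through `to_fasta` and `from_fasta` returns the original records, which makes the format convenient for exchanging a library between programs.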

[0270] The present invention features a pair of isolated nucleic acid 
molecules, each from about 10 to about 200 nucleotides in length. The first nucleic 
acid molecule of the pair comprises a sequence of at least 10 contiguous nucleotides 
having 100% sequence identity to at least one nucleic acid sequence shown in SEQ ID 
NOS.: 1 - 104. The second nucleic acid molecule of the pair comprises a sequence of 
at least 10 contiguous nucleotides having 100% sequence identity to the reverse 
complement of at least one nucleic acid sequence shown in SEQ ID NOS.: 1 - 104. 
The sequence of said second nucleic acid molecule is located 3' of the nucleic acid 
sequence of the first nucleic acid molecule shown in SEQ ID NOS.: 1 - 104. The pair 
of isolated nucleic acid molecules is useful in a polymerase chain reaction or in any 
other method known in the art to amplify a nucleic acid that has sequence identity to 
the sequences shown in SEQ ID NOS.: 1 - 104, particularly when cDNA is used as a 
template. 
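
The relationship between the two members of such a pair can be sketched in Python as follows. The sketch makes a simplifying assumption: each primer is treated as matching the template exactly over its full length, whereas the paragraph above requires only that each primer comprise at least 10 contiguous identical nucleotides. The function names and the template sequence are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def amplicon_length(template, forward, reverse, min_len=10, max_len=200):
    """Return the amplicon length for a primer pair, or None if the pair is invalid.

    forward: identical to a stretch of the template strand.
    reverse: identical to a stretch of the template's reverse complement,
             whose site lies 3' of the forward primer site.
    """
    template, forward, reverse = template.upper(), forward.upper(), reverse.upper()
    if not all(min_len <= len(p) <= max_len for p in (forward, reverse)):
        return None
    fwd_start = template.find(forward)
    if fwd_start == -1:
        return None
    # The reverse primer anneals where its reverse complement occurs on the
    # template strand; search only downstream (3') of the forward primer.
    rev_site = reverse_complement(reverse)
    rev_start = template.find(rev_site, fwd_start + len(forward))
    if rev_start == -1:
        return None
    return rev_start + len(rev_site) - fwd_start


if __name__ == "__main__":
    # Hypothetical 47-nucleotide template, for illustration only.
    template = "ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAGTTTACGGA"
    fwd = template[:12]
    rev = reverse_complement(template[-12:])
    print(amplicon_length(template, fwd, rev))
```

A practical primer design would also consider melting temperature, self-complementarity, and specificity, none of which is modeled here.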

[0271] The invention features a method of determining the presence of a 
polynucleotide substantially identical to a polynucleotide sequence shown in the 
Sequence Listing, or a complement of such a nucleotide by providing its complement, 
allowing the polynucleotides to interact, and determining whether such interaction has 
occurred. 

[0272] The invention further features methods of regulating the expression 
of the subject polynucleotides and encoded polypeptides. The invention provides a 
method of inhibiting transcription or translation of a first polynucleotide encoding a 
first polypeptide of the invention by providing a second polynucleotide that 
hybridizes to the first polynucleotide, and allowing the first polynucleotide to contact 
and bind to the second polynucleotide. The second polynucleotide can be chosen 
from an antisense molecule, a ribozyme, and an interfering RNA (RNAi) molecule. 
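
As a minimal sketch of how a hybridizing second polynucleotide relates to its target, the following Python function returns a DNA antisense oligonucleotide that is complementary to a chosen window of an mRNA. The window position, the default length of 20 nucleotides, the function name, and the example sequence are assumptions made for illustration; effective antisense, ribozyme, or RNAi design involves further considerations that are not modeled here.

```python
# Watson-Crick partner (as DNA) for each base of an RNA target.
_RNA_TO_ANTISENSE_DNA = {"A": "T", "U": "A", "G": "C", "C": "G"}


def antisense_oligo(mrna: str, start: int, length: int = 20) -> str:
    """Return a DNA oligo (written 5' to 3') complementary to mrna[start:start+length].

    The input may be written with T or U; it is normalized to RNA first.
    Bases other than A, C, G, U raise KeyError.
    """
    mrna = mrna.upper().replace("T", "U")
    if start < 0 or start + length > len(mrna):
        raise ValueError("target window falls outside the mRNA")
    target = mrna[start:start + length]
    # Complement each base and reverse so the result reads 5' to 3'.
    return "".join(_RNA_TO_ANTISENSE_DNA[base] for base in reversed(target))


if __name__ == "__main__":
    # Hypothetical mRNA fragment, for illustration only.
    print(antisense_oligo("AUGGCCAUUGUAAUGGGCCGC", 0, 6))
```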

[0273] The present invention further features an isolated polypeptide, e.g., an 
isolated polypeptide encoded by a polynucleotide, and biologically active fragments 
of such polypeptide. In some embodiments, the polypeptide is a fusion protein. In 
some embodiments, the polypeptide has one or more amino acid substitutions, and/or 
insertions and/or deletions, compared with at least one sequence shown in SEQ ID 
NOS.: 1 - 104. In some embodiments, the polypeptide has an amino acid sequence 
derived from at least one nucleotide sequence shown in SEQ ID NOS.: 1-104. 

[0274] The invention also provides a method of making a polypeptide of the 
invention by providing a nucleic acid molecule that comprises a polynucleotide 
sequence encoding a polypeptide of the invention, introducing the nucleic acid 
molecule into an expression system, and allowing the polypeptide to be produced. 

[0275] In some embodiments, the method involves in vitro cell-free 
transcription and/or translation. For example, the expression system can comprise a 
cell-free expression system, such as an E. coli system, a wheat germ extract system, a 
rabbit reticulocyte system, or a frog oocyte system. 

[0276] In certain other embodiments, the expression system can comprise a 
prokaryotic or eukaryotic cell, for example, a bacterial cell expression system, a 
fungal cell expression system, such as yeast or Aspergillus, a plant cell expression 
system, e.g., a cereal plant, a tobacco plant, a tomato plant, or other edible plant, an 
insect cell expression system, such as SF9 or High Five cells, an amphibian cell 
expression system, a reptile cell expression system, a crustacean cell expression 
system, an avian cell expression system, a fish cell expression system, or a 
mammalian cell expression system, such as one using Chinese Hamster Ovary (CHO) 
cells. In some embodiments, the method involves culturing a subject host cell under 
conditions such that the subject polypeptide is produced by the host cells; and 
recovering the subject polypeptide from the culture, e.g., from within the host cells, or 
from the culture medium. In further embodiments, the polypeptide can be produced 
in vivo in a multicellular animal or plant, comprising a polynucleotide encoding the 
subject polypeptide. 

[0277] The present invention further features a non-human animal injected 
with at least one polynucleotide comprising at least one nucleotide sequence chosen 
from SEQ ID NOS.: 1 - 104, and/or at least one polypeptide comprising at least one 
amino acid sequence encoded by SEQ ID NOS.: 1 - 104. 

[0278] The present invention further features an antibody that specifically 
recognizes, binds to, interferes with, or modulates the biological activity of a subject 
polypeptide or a fragment thereof. The polypeptide can be a single-transmembrane 
protein, multiple-transmembrane protein, kinase, protein kinase, ligase, nuclear 
hormone receptor, phosphatase, protease, phosphodiesterase, kinesin, 
immunoglobulin, T-cell receptor, glycosylphosphatidylinositol anchor, or other 
nucleic acid and amino acid sequences, including, activators, adaptors, adhesion 
molecules, ATPases, ATP, breakpoints, channels, checkpoints, complexes, 
dehydrogenases, disintegrins, endopeptidases, germ-cells, GTPases, helicases, 
hydrolases, integrases, integrins, isomerases, membranes, mucins, oxygenases, 
peroxidases, phospholipases, prosaposins, proteasomes, reductases, reverse 
transcriptases, RNases, RNases H, SH3, synthetases, TATA boxes, Tat, transferases, 
transposases, ubiquitins, and viruses. The fragment can be an extracellular fragment 
of a subject polypeptide, or an extracellular fragment of a subject polypeptide minus 
the signal peptide. 

[0279] The present invention further features an antibody that specifically 
inhibits binding of a polypeptide to its ligand or substrate. It also features an antibody 
that specifically inhibits binding of a polypeptide as a substrate to another molecule. 

[0280] Another aspect of the present invention features a library of 
antibodies or fragments thereof, wherein at least one antibody or fragment thereof 
specifically binds to at least a portion of a polypeptide comprising an amino acid 
sequence encoded by SEQ ID NOS.: 1 - 104, and/or wherein at least one antibody or 
fragment thereof interferes with at least one activity of such polypeptide or fragment 
thereof. In certain embodiments, the antibody library comprises at least one antibody 
or fragment thereof that specifically inhibits binding of a subject polypeptide to its 
ligand or substrate, or that specifically inhibits binding of a subject polypeptide as a 
substrate to another molecule. The present invention also features corresponding 
polynucleotide libraries comprising at least one polynucleotide sequence that encodes 
an antibody or antibody fragment of the invention. In specific embodiments, the 
library is provided on a nucleic acid array or in computer-readable format. 

[0281] An antibody of the present invention may comprise a monoclonal 
antibody, polyclonal antibody, single chain antibody, intrabody, and active fragments 
of any of these. The active fragments include variable regions from either heavy 
chains or light chains. The antibody can comprise the backbone of a molecule with an 
immunoglobulin domain, e.g., a fibronectin backbone, a T-cell receptor backbone, or 
a CTLA4 backbone. 

[0282] The present invention further features a targeting antibody, a 
neutralizing antibody, a stabilizing antibody, an enhancing antibody, an antibody 
agonist, an antibody antagonist, an antibody that promotes cellular endocytosis of a 
target antigen, a cytotoxic antibody, and an antibody that mediates antibody 
dependent cellular cytotoxicity (ADCC). The antibody that mediates ADCC can have 
a cytotoxic component, e.g., a radioisotope, a radioactive molecule, a microbial toxin, 
a plant toxin, a chemotherapeutic agent, or a chemical substance, such as doxorubicin 
or cisplatin. The invention also features an inhibitory antibody, functioning to 
specifically inhibit the binding of a cognate polypeptide to its ligand or its substrate, 
or to specifically inhibit the binding of a cognate peptide as the substrate of another 
molecule. 

[0283] The antibodies of the present invention also encompass a human 
antibody, a non-human primate antibody, a monkey antibody, a non-primate animal 
antibody, e.g., a rodent antibody, a rat antibody, a mouse antibody, a hamster antibody, 
a guinea pig antibody, a chicken antibody, a cattle antibody, a sheep antibody, a goat 
antibody, a horse antibody, a porcine antibody, a cow antibody, a rabbit antibody, a cat 
antibody, or a dog antibody. It also features a humanized antibody, a primatized 
antibody, and a chimeric antibody. 

[0284] The antibodies of the invention can be produced in vitro or in vivo. 
For example, the present invention features an antibody produced in a cell-free 
expression system, a prokaryote expression system or a eukaryote expression system, 
as described herein. 

[0285] The invention further provides a host cell that can produce an 
antibody of the invention or a fragment thereof. The antibody may also be secreted 
by the cell. The host cell can be a hybridoma, or a prokaryotic or eukaryotic cell. 
The invention also provides a bacteriophage or other virus particle comprising an 
antibody of the invention, or a fragment thereof. The bacteriophage or other virus 
particle may display the antibody or fragment thereof on its surface, and the 
bacteriophage itself may exist within a bacterial cell. The antibody may also 
comprise a fusion protein with a viral or bacteriophage protein. 

[0286] The invention further provides transgenic multicellular organisms, 
e.g., plants or non-human animals, as well as tissues or organs, comprising a 
polynucleotide sequence encoding a subject antibody or fragment thereof. The 
organism, tissues, or organs will generally comprise cells producing an antibody of 
the invention, or a fragment thereof. 

[0287] In another aspect, the present invention features a method of making 
an antibody by immunizing a host animal. In this method, a polypeptide or a 
fragment thereof, a polynucleotide encoding a polypeptide, or a polynucleotide 
encoding a fragment thereof, is introduced into an animal in a sufficient amount to 
elicit the generation of antibodies specific to the polypeptide or fragment thereof, and 
the resulting antibodies are recovered from the animal. The polypeptide can be 
encoded by a nucleic acid molecule comprising a nucleotide sequence chosen from at 
least one polynucleotide sequence according to SEQ ID NOS.: 1-104. 

[0288] The invention thus also provides a non-human animal comprising an 
antibody of the invention. The animal can be a non-human primate (e.g., a monkey), 
a rodent (e.g., a rat, a mouse, a hamster, a guinea pig), a chicken, cattle (e.g., a sheep, 
a goat, a horse, a pig, a cow), a rabbit, a cat, or a dog. 

[0289] The present invention also features a method of making an antibody 
by isolating a spleen from an animal injected with a polypeptide or a fragment 
thereof, a polynucleotide encoding a polypeptide, or a polynucleotide encoding a 
fragment thereof, and recovering antibodies from the spleen cells. Hybridomas can be 
made from the spleen cells, and hybridomas secreting specific antibodies can be 
selected. 

[0290] The present invention further features a method of making a 
polynucleotide library from spleen cells, and selecting a cDNA clone that produces 
specific antibodies, or fragments thereof. The cDNA clone or a fragment thereof can 
be expressed in an expression system that allows production of the antibody or a 
fragment thereof, as provided herein. 

[0291] The invention also provides a method for determining the presence or 
measuring the level of a polypeptide that specifically binds to an antibody of the 
invention. This method involves allowing the antibody to interact with a sample, and 
determining whether interaction between the antibody and any polypeptide in the 
sample has occurred. Antibodies that specifically bind to at least one subject 
polypeptide are useful in diagnostic assays, e.g., to detect the presence of a subject 
polypeptide. Similarly, the invention features a method of determining the presence 
of an antibody to a polypeptide of the invention, by providing the polypeptide, 
allowing the antibody and the polypeptide to interact, and determining whether 
interaction has occurred. 

[0292] The present invention further features a method of identifying an 
agent that modulates the level of a subject polypeptide (or an mRNA encoding a 
subject polypeptide) in a cell. The method generally involves contacting a cell (e.g., a 
eukaryotic cell) that produces the subject polypeptide with a test agent; and 
determining the effect, if any, of the test agent on the level of the polypeptide in the 
cell. 

[0293] The present invention further features a method of identifying an 
agent that modulates biological activity of a subject polypeptide. The methods 
generally involve contacting a subject polypeptide with a test agent; and determining 
the effect, if any, of the test agent on the activity of the polypeptide. In certain 
embodiments, the polypeptide is expressed on a cell surface. In certain embodiments, 
the agent or modulator is an antibody, for example, where an antibody binds to the 
polypeptide or affects its biological activity. 

[0294] The present invention further features biologically active agents (or 
modulators) identified using a method of the invention. 

[0295] The present invention also features a method of modulating 
biological activity using an agent selectable by the above methods. Briefly, the 
method of modulating biological activity comprises contacting the agent with a first 
human or a non-human host cell, thereby modulating the activity of the first host cell 
or a second host cell. In one example, contacting the agent with the first human or 
non-human host cell results in the recruitment of a second host cell. The agent may 
be an antibody or antibody fragment of the invention. 

[0296] The modulation can comprise directly enhancing cell activity, 
indirectly enhancing cell activity, directly inhibiting cell activity, or indirectly 
inhibiting cell activity. The cell activity that is modulated can include transcription, 
translation, cell cycle control, signal transduction, intracellular trafficking, cell 
adhesion, cell mobility, proteolysis, ion transport, water transport, DNA repair, 
hydrolysis, lipase activity, polymerization using an RNA template or a DNA template, 
and nuclease activity. The modulation can result in cell death or apoptosis, or 
inhibition of cell death or apoptosis, as well as cell growth, cell proliferation, or cell 
survival, or inhibition of cell growth, cell proliferation, or cell survival; as well as 
mucosal preservation, inhibition of eicosanoid synthesis, or resistance to infection by 
viruses. 

[0297] Either the first or the second host cell can be a human or a non- 
human host cell. Either the first or the second host cell can be an immune cell, e.g., a 
T cell, B cell, NK cell, dendritic cell, macrophage, muscle cell, stem cell, skin cell, fat 
cell, blood cell, brain cell, bone marrow cell, endothelial cell, retinal cell, bone cell, 
kidney cell, pancreatic cell, liver cell, spleen cell, prostate cell, cervical cell, ovarian 
cell, breast cell, lung cell, liver cell, soft tissue cell, colorectal cell, other cell of the 
gastrointestinal tract, or a cancer cell. 

[0298] The invention also provides a method of diagnosing cancer, 
proliferative, inflammatory, immune, viral, bacterial, or metabolic disorder in a 
patient, by allowing an antibody specific for a polypeptide of the invention to contact 
a patient sample, and detecting specific binding between the antibody and any antigen 
in the sample to determine whether the subject has cancer, proliferative, 
inflammatory, immune, viral, bacterial, or metabolic disorder. 

[0299] The invention further provides a method of diagnosing cancer, 
proliferative, inflammatory, immune, viral, bacterial, or metabolic disorder in a 
patient, by allowing a polypeptide of the invention to contact a patient sample, and 
detecting specific binding between the polypeptide and any interacting molecule in 
the sample to determine whether the subject has cancer, proliferative, inflammatory, 
immune, viral, bacterial, or metabolic disorder. 

[0300] The invention also features a method of providing a polynucleotide, a 
polypeptide, or an agent of the invention, such as an antibody, to a subject by oral, 
buccal, nasal, rectal, intraperitoneal, intradermal, transdermal, intratracheal, 
intrathecal, or parenteral administration, or otherwise by implantation or inhalation. 
For example, the polynucleotide, polypeptide or agent can be administered 
intranasally, intravenously, intra-arterially, intracardiacally, subcutaneously, 
intraperitoneally, transdermally, intraventricularly, or intracranially. The invention 
also provides a method for formulating a polynucleotide, polypeptide, or modulator 
composition, such as an antibody composition, for delivery by any of the routes of 
administration provided above, for example, for treatment of disorders. For example, 
the parenteral delivery can be via inhalation or implantation. The parenteral delivery 
can also be oral, intranasal, intraventricular, or intracranial. 

[0301] The present invention also features a pharmaceutical composition 
comprising a polynucleotide, polypeptide, or modulator of the invention and a carrier. 
The carrier can be a pharmaceutically acceptable carrier. The modulator can be 
obtainable by any methods of the invention, for example, the modulator can be an 
antibody or a fragment thereof. Further, oral formulations, preparations for injection, 
aerosol formulations, and suppositories can be prepared, each comprising the 
polynucleotide, polypeptide, or modulator composition. Further, nucleic acid 
compositions comprising polynucleotide sequences encoding the subject antibodies, 
or fragments thereof, can be prepared for administration to a subject. 

[0302] The invention also features a non-human animal injected with the 
polynucleotide, polypeptide, or modulator composition, for example the antibody 
composition. Again, the animal can be a non-human primate (e.g., a monkey), a 
rodent (e.g., a rat, a mouse, a hamster, a guinea pig), a chicken, cattle (e.g., a sheep, a 
goat, a horse, a pig, a cow), a rabbit, a cat, or a dog. 

[0303] In another aspect, the invention provides a method of treating a 
disorder in a subject needing or desiring such treatment, comprising administering a 
polynucleotide, polypeptide, or modulator of the invention to the subject. The subject 
can be a human or a non-human animal. The disorder can be cancer, proliferative, 
inflammatory, immune, metabolic, ulcerative, bacterial, or viral disorders. 

[0304] For example, the method of treatment may comprise administering an 
antibody composition with a first antibody that specifically binds to a first epitope of a 
first polypeptide or a fragment thereof, or that interferes with at least one activity of 
the first polypeptide or a fragment thereof, wherein the first polypeptide is encoded by 
a nucleic acid molecule comprising a nucleotide sequence chosen from SEQ ID 
NOS.: 1 - 104, or any nucleic acid of the present invention. In certain embodiments, 
this method further comprises using a second antibody that binds specifically to or 
interferes with the activity of a second epitope of the first polypeptide or to a first 
epitope of a second polypeptide. The second polypeptide can be encoded by a nucleic 
acid molecule comprising a nucleotide sequence chosen from SEQ ID NOS.: 1-104, 
or any nucleic acid of the present invention. In certain embodiments, the antibody 
binds, or interferes with the activity of, at least one polypeptide fragment, wherein the 
fragment is an extracellular fragment of the polypeptide, or an extracellular fragment 
of the polypeptide minus the signal peptide, for the treatment, for example, of 
proliferative disorders, such as cancer. 

[0305] In other embodiments, the modulator may bind to a cell surface 
molecule that is over-expressed in the disorder. Further, the modulator may be linked 
to an antibody of the invention. The antibody can be capable of initiating antibody 
dependent cell cytotoxicity, e.g., where the antibody is in turn coupled to cytotoxic 
agents. This method is applicable when the disorder is cancer, another proliferative 
disorder, inflammatory, immune, bacterial, viral, or metabolic disorder, and the cell 
surface molecule is over-expressed in a cancer cell, diseased cell or virus-infected 
cell. The cell surface molecule can be a single-transmembrane-related protein, a 
multiple-transmembrane-related protein, a kinase-related protein, a protein kinase- 
related protein, a ligase-related protein, a nuclear hormone receptor-related protein, a 
phosphatase-related protein, a protease-related protein, a phosphodiesterase-related 
protein, a kinesin-related protein, an immunoglobulin-related protein, a T-cell 
receptor-related protein, a glycosylphosphatidylinositol anchor-related protein, or 
other amino acid sequence, including, an activator-related protein, an adaptor-related 
protein, an adhesion molecule-related protein, an ATPase-related protein, an ATP- 
related protein, a breakpoint-related protein, a channel-related protein, a checkpoint- 
related protein, a complex-related protein, a dehydrogenase-related protein, a 
disintegrin-related protein, an endopeptidase-related protein, a germ-cell-related 
protein, a GTPase-related protein, a helicase-related protein, a hydrolase-related 
protein, an integrase-related protein, an integrin-related protein, an isomerase-related 
protein, a membrane-related protein, a mucin-related protein, an oxygenase-related 
protein, a peroxidase-related protein, a phospholipase-related protein, a prosaposin- 
related protein, a proteasome-related protein, a reductase-related protein, a reverse 
transcriptase-related protein, an RNase-related protein, an RNase H-related protein, an 
SH3-related protein, a synthetase-related protein, a TATA box-related protein, a Tat- 
related protein, a transferase-related protein, a transposase-related protein, a ubiquitin- 
related protein, or a virus-related protein that is over-expressed in cancer, proliferative, 
inflammatory, immune, bacterial, viral, or metabolic disorder. 

[0306] The invention also provides a method for prophylactic or therapeutic 
treatment of a subject needing or desiring such treatment by providing a vaccine that 
can be administered to the subject. The vaccine may comprise one or more of a 
polynucleotide, polypeptide, or modulator of the invention, for example an antibody 
vaccine composition, a polypeptide vaccine composition, or a polynucleotide vaccine 
composition, useful for treating cancer, proliferative, inflammatory, immune, 
metabolic, bacterial, or viral disorders. 

[0307] For example, the vaccine can be a cancer vaccine, and the 
polypeptide can concomitantly be a cancer antigen. The vaccine may be an
anti-inflammatory vaccine, and the polypeptide can concomitantly be an
inflammation-related antigen. The vaccine may be a viral vaccine, and the polypeptide
can concomitantly be a viral antigen. In some embodiments, the vaccine comprises a
polypeptide fragment, comprising at least one extracellular fragment of a polypeptide
of the invention, and/or at least one extracellular fragment of a polypeptide of the
invention minus the signal peptide, for the treatment, for example, of proliferative
disorders, such as cancer. In certain embodiments, the vaccine comprises a
polynucleotide encoding one or more such fragments, administered for the treatment, 
for example, of proliferative disorders, such as cancer. Further, the vaccine can be 
administered with or without an adjuvant. 

[0308] In another aspect, the invention provides a method for gene therapy 
by providing a polynucleotide comprising a nucleic acid molecule encoding a 
polypeptide, such as an antibody of the invention, and administering the 
polynucleotide to a subject needing or desiring such treatment. 

[0309] The invention further provides a kit comprising one or more of a 
polynucleotide, polypeptide, or modulator composition, such as an antibody 
composition, which may include instructions for its use. Such kits are useful in 
diagnostic applications, for example, to detect the presence and/or level of a 
polypeptide in a biological sample by specific antibody interaction. 

Modes for Carrying Out the Invention 
Brief Description of the Table 

[0310] Each sequence shown in Table 1 is identified by a Five Prime 
Therapeutics, Inc. (FP) identification number (FP ID). Each protein in Table 1 is also 
described by an annotation of the Fantom mouse protein with the greatest degree of 
similarity to the claimed sequences. The Fantom database was compiled by the 
Fantom Consortium and is accessible, for example, at http://fantom.gsc.riken.go.jp/db/ 
(Bono et al., 2002). It provides curated functional annotation to full-length 
mouse sequences (Okazaki et al., 2002). The similarities of the claimed sequences of 
the invention with the annotated sequences in Table 1 suggest that they may share 
structural and functional properties, and exhibit similar expression profiles and 
localizations. 
Definitions 

[0311] "Related sequences" include nucleotide and amino acid sequences 
that are involved in the function of their referent. For example, "receptor-related 
sequences" include all sequences that are involved in receptor function. This 
includes, but is not limited to, sequences that are involved in receptor synthesis, 
receptor regulation, receptor effector function, and receptor degradation. "Related 
sequences" also encompass complementary nucleic acid sequences, and biologically 
active fragments of nucleic acid and amino acid sequences. 

[0312] The terms "polynucleotide," "nucleotide," "nucleic acid," 
"polynucleic molecule," "nucleotide molecule," "nucleic acid molecule," "nucleic acid 
sequence," "polynucleotide sequence," and "nucleotide sequence" are used 
interchangeably herein to refer to polymeric forms of nucleotides of any length. The 
polynucleotides can contain deoxyribonucleotides, ribonucleotides, and/or their 
analogs or derivatives. For example, nucleic acids can be naturally occurring DNA or 
RNA, or can be synthetic analogs, as known in the art. The terms also encompass 
genomic DNA, genes, gene fragments, exons, introns, regulatory sequences or 
regulatory elements (such as promoters, enhancers, initiation and termination regions, 
other control regions, expression regulatory factors, and expression controls), DNA 
comprising one or more single-nucleotide polymorphisms (SNPs), allelic variants, 
isolated DNA of any sequence, and cDNA. The terms also encompass mRNA, tRNA, 
rRNA, ribozymes, splice variants, antisense RNA, antisense conjugates, RNAi, and 
isolated RNA of any sequence. The terms also encompass recombinant 
polynucleotides, heterologous polynucleotides, branched polynucleotides, labeled 
polynucleotides, hybrid DNA/RNA, polynucleotide constructs, vectors comprising the 
subject nucleic acids, nucleic acid probes, primers, and primer pairs. The 
polynucleotides can comprise modified nucleic acid molecules, with alterations in the 
backbone, sugars, or heterocyclic bases, such as methylated nucleic acid molecules, 
peptide nucleic acids, and nucleic acid molecule analogs, which may be suitable as, 
for example, probes if they demonstrate superior stability and/or binding affinity 
under assay conditions. Analogs of purines and pyrimidines, including radiolabeled 
and fluorescent analogs, are known in the art. The polynucleotides can have any 
three-dimensional structure, and can perform any function, known or as yet unknown. 
The terms also encompass single-stranded, double-stranded and triple helical 
molecules that are either DNA, RNA, or hybrid DNA/RNA and that may encode a 
full-length gene or a biologically active fragment thereof. Biologically active 
fragments of polynucleotides can encode the polypeptides herein, as well as anti-sense 
and RNAi molecules. Thus, the full length polynucleotides herein may be treated 
with enzymes, such as Dicer, to generate a library of short RNAi fragments which are 
within the scope of the present invention. 

[0313] The novel polynucleotides herein include those shown in the Table, 
SEQ ID NOS.: 1-104, and biologically active fragments thereof. The 
polynucleotides also include modified, labeled, and degenerate variants of the nucleic 
acid sequences, as well as nucleic acid sequences that are substantially similar or 
homologous to nucleic acids encoding the subject proteins. 

[0314] A "biologically active" entity, or an entity having "biological 
activity," is one having structural, regulatory, or biochemical functions of a naturally 
occurring molecule or any function related to or associated with a metabolic or 
physiological process. Biologically active polynucleotide fragments are those 
exhibiting activity similar, but not necessarily identical, to an activity of a 
polynucleotide of the present invention. The biological activity can include an 
improved desired activity, or a decreased undesirable activity. For example, an entity 
demonstrates biological activity when it participates in a molecular interaction with 
another molecule, or when it has therapeutic value in alleviating a disease condition, 
or when it has prophylactic value in inducing an immune response to the molecule, or 
when it has diagnostic value in determining the presence of the molecule, such as a 
biologically active fragment of a polynucleotide that can be detected as unique for the 
polynucleotide molecule, or that can be used as a primer in PCR. 

[0315] The term "degenerate variant" of a nucleic acid sequence refers to all 
nucleic acid sequences that can be directly translated, according to the standard 
genetic code, to provide an amino acid sequence identical to that translated from a 
reference nucleic acid sequence. 
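
The definition above can be illustrated with a minimal Python sketch that treats two coding sequences as degenerate variants when they translate to the same amino acid sequence under the standard genetic code. The function names and the two short example sequences are hypothetical and are not taken from the Table; the sketch assumes in-frame coding sequences whose length is a multiple of three.

```python
from itertools import product

# Standard genetic code, with codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate an in-frame coding sequence into one-letter amino acid codes."""
    dna = dna.upper().replace("U", "T")  # accept RNA input as well
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def is_degenerate_variant(reference, candidate):
    """True if both sequences encode the identical amino acid sequence."""
    return translate(reference) == translate(candidate)


if __name__ == "__main__":
    reference = "ATGGCTCTGAGC"  # hypothetical sequence: Met-Ala-Leu-Ser
    variant = "ATGGCCTTATCA"    # same peptide, different codons
    print(translate(reference), is_degenerate_variant(reference, variant))
```

In this sketch the comparison is made at the level of the translated peptide, so any number of synonymous codon substitutions leaves the result unchanged, while a single non-synonymous substitution does not.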

[0316] The term "gene" or "genomic sequence" as used herein is an open 
reading frame encoding specific proteins and polypeptides, for example, an mRNA, 
cDNA, or genomic DNA, and also may or may not include intervening introns, or 
adjacent 5' and 3' non-coding nucleotide sequences involved in the regulation of 
expression up to about 20 kb beyond the coding region, and possibly further in either 
direction. A gene can be introduced into an appropriate vector for extrachromosomal 
maintenance or for integration into a host genome. 

[0317] The term "transgene" as used herein is a nucleic acid sequence that is 
incorporated into a transgenic organism. A "transgene" can contain one or more 
transcriptional regulatory sequences, and other sequences, such as introns, that may be 
useful for expressing or secreting the nucleic acid or fusion protein it encodes. 

[0318] The term "cDNA" as used herein is intended to include all nucleic 
acids that share the sequence elements of mature mRNA species, where sequence 
elements are exons and 3' and 5' non-coding regions. Generally, mRNA species have 
contiguous exons, the intervening introns having been removed by nuclear RNA 
splicing to create a continuous open reading frame encoding a protein. 

[0319] The term "splice variant" refers to all types of RNAs transcribed from 
a given gene that when processed collectively encode plural protein isoforms. The 
term "alternative splicing" and related terms refer to all types of RNA processing that 
lead to expression of plural protein isoforms from a single gene. Some genes are first 
transcribed as long mRNA precursors that are then shortened by a series of processing 
steps to produce the mature mRNA molecule. One of these steps is RNA splicing, in 
which the intron sequences are removed from the mRNA precursor. A cell can splice 
the primary transcript in different ways, making different "splice variants," and 
thereby making different polypeptide chains from the same gene, or from the same 
mRNA molecule. Splice variants can include, for example, exon insertions, exon 
extensions, exon truncations, exon deletions, alternatives in the 5' untranslated region, 
and alternatives in the 3' untranslated region. 

[0320] "Oligonucleotide" may generally refer to polynucleotides of between 
about 5 and about 100 nucleotides of single- or double-stranded nucleic acids. For the 
purposes of this disclosure, there is no upper limit to the length of an oligonucleotide. 
Oligonucleotides are also known as oligomers or oligos and can be isolated from 
genes, or chemically synthesized by methods known in the art. 

[0321] "Nucleic acid composition" as used herein is a composition 
comprising a nucleic acid sequence, including one having an open reading frame that 
encodes a polypeptide and is capable, under appropriate conditions, of being 
expressed as a polypeptide. The term includes, for example, vectors, including 
plasmids, cosmids, viral vectors (e.g., retrovirus vectors such as lentivirus, 
adenovirus, and the like), human, yeast, bacterial, P1-derived artificial chromosomes 
(HAC's, YAC's, BAC's, PAC's, etc.), and mini-chromosomes, in vitro host cells, in 
vivo host cells, tissues, organs, allogenic or congenic grafts or transplants, 
multicellular organisms, and chimeric, genetically modified, or transgenic animals 
comprising a subject nucleic acid sequence. 

[0322] An "isolated," "purified," or "substantially isolated" polynucleotide, 
or a polynucleotide in "substantially pure form," in "substantially purified form," in 
"substantial purity," or as an "isolate," is one that is substantially free of the sequences 
with which it is associated in nature, or other nucleic acid sequences that do not 
include a sequence or fragment of the subject polynucleotides. By substantially free 
is meant that less than about 90%, less than about 80%, less than about 70%, less than 
about 60%, or less than about 50% of the composition is made up of materials other 
than the isolated polynucleotide. For example, the isolated polynucleotide is at least 
about 50%, at least about 60%, at least about 70%, at least about 80%, at least about 
90%, at least about 95%, at least about 97%, or at least about 99% free of the 
materials with which it is associated in nature. For example, an isolated 
polynucleotide may be present in a composition wherein at least about 50%, at least 
about 60%, at least about 70%, at least about 80%, at least about 90%, at least about 
95%, at least about 97%, or at least about 99% of the total macromolecules (for example, 
polypeptides, fragments thereof, polynucleotides, fragments thereof, lipids, 
polysaccharides, and oligosaccharides) in the composition is the isolated 
polynucleotide. Where at least about 99% of the total macromolecules is the isolated 
polynucleotide, the polynucleotide is at least about 99% pure, and the composition 
comprises less than about 1% contaminant. As used herein, an "isolated," "purified" 
or "substantially isolated" polynucleotide, or a polynucleotide in "substantially pure 
form," in "substantially purified form," in "substantial purity," or as an "isolate," also 
refers to recombinant polynucleotides, modified, degenerate and homologous 
polynucleotides, and chemically synthesized polynucleotides, which, by virtue of 
origin or manipulation, are not associated with all or a portion of a polynucleotide 
with which it is associated in nature, are linked to a polynucleotide other than that to 
which it is linked in nature, or do not occur in nature. For example, the subject 
polynucleotides are generally provided as other than on an intact chromosome, and 
recombinant embodiments are typically flanked by one or more nucleotides not 
normally associated with the subject polynucleotide on a naturally-occurring 
chromosome. 

[0323] The terms "polypeptide," "peptide," and "protein," used 
interchangeably herein, refer to a polymeric form of amino acids of any length, which 
can include naturally-occurring amino acids, coded and non-coded amino acids, 
chemically or biochemically modified, derivatized, or designer amino acids, amino 
acid analogs, peptidomimetics, and depsipeptides, and polypeptides having modified, 
cyclic, bicyclic, depsicyclic, or depsibicyclic peptide backbones. The term includes 
single chain protein as well as multimers. The term also includes conjugated proteins, 
fusion proteins, including, but not limited to, GST fusion proteins, fusion proteins 
with a heterologous amino acid sequence, fusion proteins with heterologous and 
homologous leader sequences, fusion proteins with or without N-terminal methionine 
residues, pegylated proteins, and immunologically tagged proteins. Also included in 
this term are variations of naturally occurring proteins, where such variations are 
homologous or substantially similar to the naturally occurring protein, as well as 
corresponding homologs from different species. Variants of polypeptide sequences 
include insertions, additions, deletions, or substitutions compared with the subject 
polypeptides. The term also includes peptide aptamers. 

[0324] The novel polypeptides herein include amino acid sequences encoded 
by an open reading frame (ORF), described in greater detail below, including the full 
length protein and fragments thereof, particularly biologically active fragments and/or 
fragments corresponding to functional domains, e.g., a signal peptide or leader 
sequence, an enzyme active site, including a cleavage site and an enzyme catalytic 
site, a domain for interaction with other protein(s), a domain for binding DNA, a 
regulatory domain, a consensus domain that is shared with other members of the same 
protein family, such as a kinase family or an immunoglobulin family; an extracellular 
domain that may act as a target for antibody production or that may be cleaved to 
become a soluble receptor or a ligand for a receptor; an intracellular fragment of a 
transmembrane protein that participates in signal transduction; a transmembrane 
domain of a transmembrane protein that may facilitate water or ion transport; a 
sequence associated with cell survival and/or cell proliferation; a sequence associated 
with cell cycle arrest, DNA repair and/or apoptosis; a sequence associated with a 
disease or disease prognosis, including types of cancer, degenerative disease, 
inflammatory disease, immunological disease, genetic disease, metabolic disease, 
and/or viral infection; and including fusions of the subject polypeptides to other 
proteins or parts thereof; modifications of the subject polypeptide, e.g., comprising 
modified, derivatized, or designer amino acids, modified peptide backbones, and/or 
immunological tags; as well as intra- and inter-species homologs of the subject 
polypeptides. 

[0325] As noted above, a "biologically active" entity, or an entity having 
"biological activity," is one having structural, regulatory, or biochemical functions of 
a naturally occurring molecule or any function related to or associated with a 
metabolic or physiological process. Biologically active polypeptide fragments are 
those exhibiting activity similar, but not necessarily identical, to an activity of a 
polypeptide of the present invention. The biological activity can include an improved 
desired activity, or a decreased undesirable activity. For example, an entity 
demonstrates biological activity when it participates in a molecular interaction with 
another molecule, or when it has therapeutic value in alleviating a disease condition, 
or when it has prophylactic value in inducing an immune response to the molecule, or 
when it has diagnostic value in determining the presence of the molecule. A 
biologically active polypeptide or fragment thereof includes one that can participate in 
a biological reaction, for example, as a transcription factor that combines with other 
transcription factors for initiation of transcription, or that can serve as an epitope or 
immunogen to stimulate an immune response, such as production of antibodies, or 
that can transport molecules into or out of cells, or that can perform a catalytic 
activity, for example polymerization or nuclease activity, or that can participate in 
signal transduction by binding to receptors, proteins, or nucleic acids, activating 
enzymes or substrates. 

[0326] A "signal peptide," or a "leader sequence," comprises a sequence of 
amino acid residues, typically at the N terminus of a polypeptide, which directs the 
intracellular trafficking of the polypeptide. Polypeptides that contain a signal peptide 
or leader sequence typically also contain a signal peptide or leader sequence cleavage 
site. Such polypeptides, after cleavage at the cleavage sites, generate mature 
polypeptides, for example, after extracellular secretion or after being directed to the 
appropriate intracellular compartment. 

[0327] "Depsipeptides" are compounds containing a sequence of at least two 
alpha-amino acids and at least one alpha-hydroxy carboxylic acid, which are bound 
through at least one normal peptide link and ester links, derived from the hydroxy 
carboxylic acids. "Linear depsipeptides" can comprise rings formed through S-S 
bridges, or through a hydroxy or a mercapto group of a hydroxy- or mercapto-amino 
acid and the carboxyl group of another amino- or hydroxy-acid, but do not 
comprise rings formed only through peptide or ester links derived from hydroxy 
carboxylic acids. "Cyclic depsipeptides" are peptides containing at least one ring 
formed only through peptide or ester links, derived from hydroxy carboxylic acids. 

[0328] An "isolated," "purified," or "substantially isolated" polypeptide, or a 
polypeptide in "substantially pure form," in "substantially purified form," in 
"substantial purity," or as an "isolate," is one that is substantially free of the materials 
with which it is associated in nature or other polypeptide sequences that do not 
include a sequence or fragment of the subject polypeptides. By substantially free is 
meant that less than about 90%, less than about 80%, less than about 70%, less than 
about 60%, or less than about 50% of the composition is made up of materials other 
than the isolated polypeptide. For example, the isolated polypeptide is at least about 
50%, at least about 60%, at least about 70%, at least about 80%, at least about 90%, at 
least about 95%, at least about 97%, or at least about 99% free of the materials with 
which it is associated in nature. For example, an isolated polypeptide may be present 
in a composition wherein at least about 50%, at least about 60%, at least about 70%, 
at least about 80%, at least about 90%, at least about 95%, at least about 97%, or at 
least about 99% of the total macromolecules (for example, polypeptides, fragments 
thereof, polynucleotides, fragments thereof, lipids, polysaccharides, and 
oligosaccharides) in the composition is the isolated polypeptide. Where at least about 
99% of the total macromolecules is the isolated polypeptide, the polypeptide is at least 
about 99% pure, and the composition comprises less than about 1% contaminant. As 
used herein, an "isolated," "purified," or "substantially isolated" polypeptide, or a 
polypeptide in "substantially pure form," in "substantially purified form," in 
"substantial purity," or as an "isolate," also refers to recombinant polypeptides, 
modified, tagged and fusion polypeptides, and chemically synthesized polypeptides, 
which, by virtue of origin or manipulation, are not associated with all or a portion of 
the materials with which they are associated in nature, are linked to molecules other 
than those to which they are linked in nature, or do not occur in nature. 

[0329] Detection methods of the invention can be qualitative or quantitative. 
Thus, as used herein, the terms "detection," "identification," "determination," and the 
like, refer to both qualitative and quantitative determinations, and include 
"measuring." For example, detection methods include methods for detecting the 
presence and/or level of polynucleotide or polypeptide in a biological sample, and 
methods for detecting the presence and/or level of biological activity of 
polynucleotide or polypeptide in a sample. 

[0330] As used herein, the term "array" or "microarray" may be used 
interchangeably and refers to a collection of plural biological molecules such as 
nucleic acids, polypeptides, or antibodies, having locatable addresses that may be 
separately detectable. Generally, "microarray" encompasses use of sub-microgram 
quantities of biological molecules. The biological molecules may be affixed to a 
substrate or may be in solution or suspension. The substrate can be porous or solid, 
planar or non-planar, unitary or distributed, such as a glass slide, a 96-well plate, with 
or without the use of microbeads or nanobeads. As such, the term "microarray" 
includes all of the devices referred to as microarrays in Schena, 1999; Bassett et al., 
1999; Bowtell, 1999; Brown and Botstein, 1999; Chakravarti, 1999; Cheung et al., 
1999; Cole et al., 1999; Collins, 1999; Debouck and Goodfellow, 1999; Duggan et al., 
1999; Hacia, 1999; Lander, 1999; Lipshutz et al., 1999; Southern et al., 1999; 
Schena, 2000; Brenner et al., 2000; Lander, 2001; Steinhaur et al., 2002; and Espejo et 
al., 2002. Nucleic acid microarrays include both oligonucleotide arrays (DNA chips) 
containing expressed sequence tags ("ESTs") and arrays of larger DNA sequences 
representing a plurality of genes bound to the substrate, either one of which can be 
used for hybridization studies. Protein and antibody microarrays include arrays of 
polypeptides or proteins, including but not limited to, polypeptides or proteins 
obtained by purification, fusion proteins, and antibodies, and can be used for specific 
binding studies (Zhu and Snyder, 2003; Houseman et al., 2002; Schaeferling et al., 
2002; Weng et al., 2002; Winssinger et al., 2002; Zhu et al., 2001; Zhu et al., 2001; 
and MacBeath and Schreiber, 2000). 

[0331] A "nucleic acid hybridization reaction" is one in which single strands 
of DNA or RNA randomly collide with one another, and bind to each other only when 
their nucleotide sequences have some degree of complementarity. The solvent and 
temperature conditions can be varied in the reactions to modulate the extent to which 
the molecules can bind to one another. Hybridization reactions can be performed 
under different conditions of "stringency." The "stringency" of a hybridization 
reaction as used herein refers to the conditions (e.g., solvent and temperature 
conditions) under which two nucleic acid strands will either pair or fail to pair to form 
a "hybrid" helix. 

[0332] "Tm" is the temperature in degrees Celsius at which 50% of a 
polynucleotide duplex made of complementary strands of nucleic acids that are 
hydrogen bonded in an anti-parallel direction by Watson-Crick base pairing dissociate 
into single strands under conditions of the hybridization reaction. Tm can be predicted 
according to a standard formula, such as: Tm = 81.5 + 16.6 log[X+] + 0.41 (%G/C) - 
0.61 (%F) - 600/L, where [X+] is the cation concentration (usually sodium ion, Na+) in 
mol/L; (%G/C) is the number of G and C residues as a percentage of total residues in 
the duplex; (%F) is the percent formamide in solution (wt/vol); and L is the number of 
nucleotides in each strand of the paired nucleic acids. 
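
As a worked illustration of the formula above, the following Python sketch evaluates Tm for a given set of hybridization conditions. The function and parameter names are hypothetical, and the logarithm is assumed to be base 10, which is the usual convention for this empirical formula; the text itself states only "log".

```python
import math


def predict_tm(cation_molar, gc_percent, formamide_percent, length):
    """Predict Tm in degrees Celsius.

    Tm = 81.5 + 16.6 log10[X+] + 0.41 (%G/C) - 0.61 (%F) - 600/L

    cation_molar       cation concentration [X+] in mol/L (usually Na+)
    gc_percent         G and C residues as a percentage of total residues
    formamide_percent  percent formamide in solution (wt/vol)
    length             number of nucleotides in each strand of the duplex
    """
    if cation_molar <= 0:
        raise ValueError("cation concentration must be positive")
    if length <= 0:
        raise ValueError("duplex length must be positive")
    return (
        81.5
        + 16.6 * math.log10(cation_molar)  # base-10 log is an assumption
        + 0.41 * gc_percent
        - 0.61 * formamide_percent
        - 600.0 / length
    )


if __name__ == "__main__":
    # Hypothetical conditions: 1 M Na+, 50% G/C, no formamide, 100-nucleotide duplex.
    # 81.5 + 0 + 20.5 - 0 - 6 gives a Tm of about 96 degrees Celsius.
    print(round(predict_tm(1.0, 50.0, 0.0, 100), 1))
```

Under this formula, lowering the cation concentration or adding formamide lowers the predicted Tm, which is one way the stringency of a hybridization reaction can be adjusted.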

[0333] A "buffer" is a system that tends to resist change in pH when a given 
increment of hydrogen ion or hydroxide ion is added. Buffered solutions contain 
conjugate acid-base pairs. Any conventional buffer can be used with the inventions 
herein including but not limited to, for example, Tris, phosphate, imidazole, and 
bicarbonate. 

[0334] A "library" of polynucleotides comprises a collection of sequence 
information of a plurality of polynucleotide sequences, which information is provided 
in either biochemical form (e.g., as a collection of polynucleotide molecules), or in 
electronic form (e.g., as a collection of polynucleotide sequences stored in a 
computer-readable form, as in a computer-based system, a computer data file, and/or 
as part of a computer program). 



[0335] A "library" of polypeptides comprises a collection of sequence 
information of a plurality of polypeptide sequences, which information is provided in, 
e.g., a collection of polypeptide sequences stored in a computer-readable form, as in a 
computer-based system, a computer data file, and/or as part of a computer program. 

[0336] "Media" refers to a manufacture, other than an isolated nucleic acid 
molecule, that contains the sequence information of the present invention. Such a 
manufacture provides the genome sequence or a subset thereof in a form that can be 
examined by means not directly applicable to the sequence as it exists in a nucleic 
acid, e.g., with computer-readable media comprising data storage structures. Such 
media include, but are not limited to: magnetic storage media, such as a floppy disc, a 
hard disc storage medium, and a magnetic tape; optical storage media such as 
CD-ROM; electrical storage media such as RAM and ROM; and hybrids of these 
categories such as magnetic/optical storage media. 

[0337] "Recorded" refers to a process for storing information on computer 
readable media, using any such methods as known in the art. 

[0338] As used herein, "a computer-based system" refers to the hardware 
means, software means, and data storage means used to analyze the nucleotide 
sequence information of the present invention. The minimum hardware of the 
computer-based systems of the present invention comprises a central processing unit 
(CPU), input means, output means, and data storage means. A skilled artisan can 
readily appreciate that any one of the currently available computer-based systems is 
suitable for use in the present invention. The data storage means can comprise any 
manufacture comprising a recording of the present sequence information as described 
above, or a memory access means that can access such a manufacture. 

[0339] "Search means" refers to one or more programs implemented on the 
computer-based system, to compare a target sequence or target structural motif, or 
expression levels of a polynucleotide in a sample, with the stored sequence 
information. A variety of algorithms are publicly known and commercially 
available, e.g., MacPattern (EMBL), BLAST, BLASTN and BLASTX (NCBI), 
gapped BLAST, BLAZE, the Wise package, FASTX, ClustalW, FASTA, FASTA3, 
Align0, TCoffee, BestFit, FastDB, and TeraBLAST (TimeLogic, Crystal Bay, 
Nevada). Search means can be used to identify fragments or regions of the genome 
that match a particular target sequence or target motif, for example, based on 
sequence similarity, for example, to identify open reading frames (ORFs) within the 
genome that contain homology to ORFs from other organisms. 
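
By way of a much simplified illustration of a search means, the following Python sketch locates exact occurrences of a target sequence, on either strand, within a stored nucleotide sequence. It is not an implementation of any of the programs named above, which score inexact similarity; the function names and example sequences are hypothetical.

```python
def reverse_complement(dna):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return dna.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_exact_matches(stored, target):
    """Return sorted (start, strand) pairs for exact matches of target.

    start is the 0-based position on the stored (forward) sequence;
    strand is "+" for a forward match and "-" for a match to the
    reverse complement of the target.
    """
    stored = stored.upper()
    target = target.upper()
    hits = set()
    for strand, query in (("+", target), ("-", reverse_complement(target))):
        start = stored.find(query)
        while start != -1:
            hits.add((start, strand))
            start = stored.find(query, start + 1)  # allow overlapping matches
    return sorted(hits)


if __name__ == "__main__":
    stored_sequence = "TTATGGCCAAGGCCATTT"  # hypothetical stored sequence
    target_sequence = "ATGGCC"              # hypothetical target sequence
    print(find_exact_matches(stored_sequence, target_sequence))
```

A practical search means would replace the exact string comparison with a scored alignment so that similar, but not identical, sequences are also reported.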

[0340] "Sequence similarity," "sequence homology," "homology," "sequence 
identity," and "percent sequence identity," used interchangeably herein, describe the 
degree of relatedness between two polynucleotide or polypeptide sequences. In 
general, "identity" means the exact match-up of two or more nucleotide sequences or 
two or more amino acid sequences, where the nucleotides or amino acids being 
compared are the same. Also, in general, "similarity" or "homology" means the exact 
match-up of two or more nucleotide sequences or two or more amino acid sequences, 
where the nucleotides or amino acids being compared are either the same or possess 
similar chemical and/or physical properties. The terms also refer to the percentage of 
the "aligned" bases (for the polynucleotides) or amino acid residues (for the 
polypeptides) that are identical when the sequences are aligned. Sequences can be 
aligned in a number of different ways and sequence similarity can be determined in a 
number of different ways. For example, the bases or amino acid residues of one 
sequence can be aligned to a gap in the other sequence, or they can be aligned only to 
another base or amino acid residue in the other sequence. A gap can range anywhere 
from one nucleotide, base, or amino acid residue to multiple exons in length, up to 
any number of nucleotides or amino acid residues. Further, sequences can be aligned 
such that nucleotides (or bases) align with nucleotides, nucleotides align with amino 
acid residues, or amino acid residues align with amino acid residues. 
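
The percentage of aligned positions that are identical can be computed as in the following minimal Python sketch. It assumes that the two sequences have already been aligned to equal length with "-" marking gaps, and that positions aligned to a gap are excluded from the count; other conventions for the denominator exist, and the choice made here is an assumption rather than a requirement of the definition above. The function name and example sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of aligned, non-gap positions at which the residues are identical."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # a residue aligned to a gap is not a compared pair
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no aligned residue pairs to compare")
    return 100.0 * identical / compared


if __name__ == "__main__":
    # Hypothetical alignment: 8 compared positions, 7 of them identical.
    print(percent_identity("ACGT-ACGT", "ACGTTACCT"))
```

The same function applies to aligned amino acid sequences; a similarity score, as opposed to identity, would additionally count pairs of residues with similar chemical or physical properties.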

[0341] A "target sequence" can be any polynucleotide or amino acid 
sequence of six or more contiguous nucleotides or two or more amino acids, for 
example, from about 5 or from about 10 to about 100 amino acids, or from about 15 
or from about 30 to about 300 nucleotides. A variety of comparing means can be 
used to accomplish comparison of sequence information from a sample (e.g., to 
analyze target sequences, target motifs, or relative expression levels) with the data 
storage means. A skilled artisan can readily recognize that any one of the publicly 
available homology search programs can be used as the search means for the 
computer-based systems of the present invention to accomplish comparison of target 
sequences and motifs. Computer programs to analyze expression levels in a sample 
and in controls are also known in the art. A "target sequence" includes an "antibody 
target sequence," which refers to an amino acid sequence that can be used as an 
immunogen for injection into animals for production of antibodies or for screening 
against a phage display or antibody library for identification of binding partners. 

[0342] A "target structural motif," or "target motif," refers to any rationally 
selected sequence or combination of sequences in which the sequence(s) are chosen 
based on a three-dimensional configuration that is formed upon the folding of the 
target motif, or on consensus sequences of regulatory or active sites. There are a 
variety of target motifs known in the art. Protein target motifs include, but are not 
limited to, enzyme active sites and signal sequences. Nucleic acid target motifs 
include, but are not limited to, hairpin structures, promoter sequences, and other 
expression elements such as binding sites for transcription factors. 

[0343] A "matrix" is a geometric network of antibody molecules and their 
antigens, as found in immunoprecipitation and flocculation reactions. An antibody 
matrix can exist in solution or on a solid phase support. 

[0344] The term "binds specifically," in the context of antibody binding, 
refers to high avidity and/or high affinity binding of an antibody to a specific 
polypeptide, or more accurately, to an epitope of a specific polypeptide. Antibody 
binding to such epitope on a polypeptide can be stronger than binding of the same 
antibody to any other epitopes, particularly other epitopes that can be present in
molecules in association with, or in the same sample as the polypeptide of interest. 
For example, when an antibody binds more strongly to one epitope than to another, 
adjusting the binding conditions can result in antibody binding almost exclusively to 
the specific epitope and not to any other epitopes on the same polypeptide, and not to 
any other polypeptide, which does not comprise the epitope. Antibodies that bind 
specifically to a subject polypeptide may be capable of binding other polypeptides at a 
weak, yet detectable, level (e.g., 10% or less of the binding shown to the polypeptide 
of interest). Such weak binding, or background binding, is readily discernible from 
the specific antibody binding to a subject polypeptide, e.g., by use of appropriate 
controls. In general, antibodies of the invention bind to a specific polypeptide with a 
binding affinity of 10⁻⁷ M or greater (e.g., 10⁻⁸ M, 10⁻⁹ M, 10⁻¹⁰ M, 10⁻¹¹ M, etc.).

[0345] The term "host cell" includes an individual cell, cell line, cell culture, 
or in vivo cell, which can be or has been a recipient of any polynucleotides or 
polypeptides of the invention, for example, a recombinant vector, an isolated 
polynucleotide, antibody or fusion protein. Host cells include progeny of a single 
host cell, and the progeny may not necessarily be completely identical (in
morphology, physiology, or in total DNA, RNA, or polypeptide complement) to the
original parent cell due to natural, accidental, or deliberate mutation and/or change. 
Host cells can be prokaryotic or eukaryotic, including mammalian, insect, amphibian, 
reptile, crustacean, avian, fish, plant and fungal cells. A host cell includes cells 
transformed, transfected, transduced, or infected in vivo or in vitro with a 
polynucleotide of the invention, for example, a recombinant vector. A host cell which 
comprises a recombinant vector of the invention may be called a "recombinant host 
cell." 

[0346] "Biological sample," "patient sample," "clinical sample," "sample," or
"biological specimen," used interchangeably herein, encompasses a variety of sample
types obtained from an individual, including biological fluids such as blood, serum,
plasma, urine, cerebrospinal fluid, tears, saliva, lymph, dialysis fluid, lavage fluid,
semen, and other liquid samples or tissues of biological origin. It includes tissue
samples and tissue cultures or cells derived therefrom and the progeny thereof,
including cells in culture, cell supernatants, and cell lysates. It includes organ or
tissue culture derived fluids, tissue biopsy samples, tumor biopsy samples, stool 
samples, and fluids extracted from physiological tissues. Cells dissociated from solid 
tissues, tissue sections, and cell lysates are included. The definition also includes 
samples that have been manipulated in any way after their procurement, such as by 
treatment with reagents, solubilization, or enrichment for certain components, such as 
polynucleotides or polypeptides. Also included in the term are derivatives and 
fractions of biological samples. A biological sample can be used in a diagnostic, 
monitoring, or screening assay. 

[0347] The terms "individual," "host," "patient," and "subject," used 
interchangeably herein, refer to a mammal, including, but not limited to, murines, 
simians, humans, felines, canines, equines, bovines, porcines, ovines, caprines, 
mammalian farm animals, mammalian sport animals, and mammalian pets.
"Mammals" or "mammalian," are used broadly to describe organisms which are
within the class Mammalia, including the orders Carnivora (e.g., dogs and cats),
Rodentia (e.g., mice, guinea pigs, and rats), and other mammals, including cattle,
goats, sheep, cows, horses, rabbits, and pigs, and primates (e.g., humans,
chimpanzees, and monkeys). 

[0348] The terms "agent," "substance," "modulator," and "compound" are 
used interchangeably herein. These terms refer to a substance that binds to or 
modulates a level or activity of a subject polypeptide or a level of mRNA encoding a
subject protein or nucleic acid, or that modulates the activity of a cell containing the
subject protein or nucleic acid. Where the agent modulates a level of mRNA
encoding a subject protein, agents include ribozymes, antisense, and RNAi molecules. 
Where the agent is a substance that modulates a level of activity of a subject 
polypeptide, agents include antibodies specific for the subject polypeptide, peptide 
aptamers, small molecules, agents that bind a ligand-binding site in a subject 
polypeptide, and the like. Antibody agents include antibodies that specifically bind a
subject polypeptide and activate the polypeptide, such as receptor-ligand binding that 
initiates signal transduction; antibodies that specifically bind a subject polypeptide 
and inhibit binding of another molecule to the polypeptide, thus preventing activation 
of a signal transduction pathway; antibodies that bind a subject polypeptide to 
modulate transcription; antibodies that bind a subject polypeptide to modulate 
translation; as well as antibodies that bind a subject polypeptide on the surface of a 
cell to initiate antibody-dependent cell-mediated cytotoxicity ("ADCC") or to initiate cell killing or
cell growth. Small molecule agents include those that bind the polypeptide to 
modulate activity of the polypeptide or cell containing the polypeptide in a similar 
fashion. The term "agent" also refers to substances that modulate a condition or 
disorder associated with a subject polynucleotide or polypeptide. Such agents include 
subject polynucleotides themselves, subject polypeptides themselves, and the like. 
Agents may be chosen from amongst candidate agents, as defined below. 

[0349] The terms "candidate agent," "subject agent," or "test agent," used 
interchangeably herein, encompass numerous chemical classes, typically synthetic, 
semi-synthetic, or naturally occurring inorganic or organic molecules, small
molecules, or macromolecular complexes. Candidate agents can be small organic 
compounds having a molecular weight of more than about 50 and less than about 
2,500 daltons. Candidate agents can comprise functional groups necessary for 
structural interaction with proteins, particularly hydrogen bonding, and can include at 
least an amine, carbonyl, hydroxyl or carboxyl group, and can contain at least two of 
the functional chemical groups. The candidate agents can comprise cyclical carbon or 
heterocyclic structures and/or aromatic or polyaromatic structures substituted with
one or more of the above functional groups. Candidate agents are also found among
biomolecules, including oligonucleotides, polynucleotides, and fragments thereof,
depsipeptides, polypeptides and fragments thereof, oligosaccharides, polysaccharides
and fragments thereof, lipids, fatty acids, steroids, purines, pyrimidines, derivatives
thereof, structural analogs, modified nucleic acids, modified, derivatized or designer
amino acids, or combinations thereof.

[0350] An "agent which modulates a biological activity of a subject
polypeptide," as used herein, describes any substance, synthetic, semi-synthetic, or 
natural, organic or inorganic, small molecule or macromolecular, pharmaceutical or 
protein, with the capability of altering a biological activity of a subject polypeptide or 
of a fragment thereof, as described herein. Generally, a plurality of assay mixtures is 
run in parallel with different agent concentrations to obtain a differential response to 
the various concentrations. Typically, one of these concentrations serves as a 
negative control, i.e., at zero concentration or below the level of detection. The 
biological activity can be measured using any assay known in the art. 

[0351] An agent which modulates a biological activity of a subject
polypeptide increases or decreases the activity at least about 10%, at least about 15%, 
at least about 20%, at least about 25%, at least about 50%, at least about 100%, or at 
least about 2-fold, at least about 5-fold, or at least about 10-fold or more when 
compared to a suitable control. 

[0352] The term "agonist" refers to a substance that mimics the function of 
an active molecule. Agonists include, but are not limited to, drugs, hormones, 
antibodies, and neurotransmitters, as well as analogues and fragments thereof.

[0353] The term "antagonist" refers to a molecule that competes for the 
binding sites of an agonist, but does not induce an active response. Antagonists 
include, but are not limited to, drugs, hormones, antibodies, and neurotransmitters, as 
well as analogues and fragments thereof. 

[0354] The term "receptor" refers to a polypeptide that binds to a specific 
extracellular molecule and may initiate a cellular response.

[0355] The term "ligand" refers to any molecule that binds to a specific site 
on another molecule. 

[0356] The term "modulate" encompasses an increase or a decrease, a 
stimulation, inhibition, or blockage in the measured activity when compared to a 
suitable control. "Modulation" of expression levels includes increasing the level and 
decreasing the level of an mRNA or polypeptide encoded by a polynucleotide of the 
invention when compared to a control lacking the agent being tested. In some 
embodiments, agents of particular interest are those which inhibit a biological activity 
of a subject polypeptide, and/or which reduce a level of a subject polypeptide in a
cell, and/or which reduce a level of a subject mRNA in a cell and/or which reduce the
release of a subject polypeptide from a eukaryotic cell. In other embodiments, agents
of interest are those that increase a biological activity of a subject polypeptide, and/or 
which increase a level of a subject polypeptide in a cell, and/or which increase a level 
of a subject mRNA in a cell and/or which increase the release of a subject polypeptide 
from a eukaryotic cell. 

[0357] An agent that "modulates the level of expression of a nucleic acid" in
a cell is one that brings about an increase or decrease of at least about 1.25-fold, at 
least about 1.5-fold, at least about 2-fold, at least about 5-fold, at least about 10-fold, 
or more in the level (i.e., an amount) of mRNA and/or polypeptide following cell 
contact with a candidate agent compared to a control lacking the agent. 
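
As a non-limiting sketch, the fold-change comparison described above can be expressed in Python. The function names and the default threshold argument are hypothetical. The sketch assumes that a "decrease of at least about 1.25-fold" means the control level divided by the treated level is at least 1.25; other conventions are possible.

```python
def fold_change(treated: float, control: float) -> float:
    """Return the magnitude of change between two levels as a fold value >= 1.

    An increase is reported as treated/control and a decrease as
    control/treated, so that both directions share one scale.
    """
    if treated <= 0 or control <= 0:
        raise ValueError("expression levels must be positive")
    ratio = treated / control
    return ratio if ratio >= 1.0 else 1.0 / ratio


def modulates_expression(treated: float, control: float,
                         threshold: float = 1.25) -> bool:
    """Return True if the level changed by at least `threshold`-fold."""
    return fold_change(treated, control) >= threshold
```

For example, an mRNA level of 50 units after contact with a candidate agent, against 100 units in a control lacking the agent, corresponds to a 2-fold decrease and would satisfy the 1.25-fold threshold.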

[0358] "Modulating a level of active subject polypeptide" includes 
increasing or decreasing activity of a subject polypeptide; increasing or decreasing a 
level of active polypeptide protein; increasing or decreasing a level of mRNA 
encoding active subject polypeptide, and increasing or decreasing the release of 
subject polypeptide from a eukaryotic cell. In some embodiments, an agent is a subject
polypeptide, where the subject polypeptide itself is administered to an individual. In 
some embodiments, an agent is an antibody specific for a subject polypeptide. In 
some embodiments, an agent is a chemical compound such as a small molecule that 
may be useful as an orally available drug. Such modulation includes the recruitment
of other molecules that directly effect the modulation. For example, an antibody that 
modulates the activity of a subject polypeptide that is a receptor on a cell surface may 
bind to the receptor and fix complement, activating the complement cascade and 
resulting in lysis of the cell. 

[0359] The term "over-expressed" refers to a state wherein there exists any
measurable increase over normal or baseline levels. For example, a molecule that is 
over-expressed in a disorder is one that is manifest in a measurably higher level 
compared to levels in the absence of the disorder. 

[0360] "Treatment," "treating," and the like, as used herein, refer to 
obtaining a desired pharmacologic and/or physiologic effect, covering any treatment 
of a pathological condition or disorder in a mammal, including a human. The effect 
may be prophylactic in terms of completely or partially preventing a disorder or 
symptom thereof and/or may be therapeutic in terms of a partial or complete cure for 
a disorder and/or adverse effect attributable to the disorder. That is, "treatment"
includes (1) preventing the disorder from occurring or recurring in a subject who may
be predisposed to the disorder but has not yet been diagnosed as having it, (2)
inhibiting the disorder, such as arresting its development, (3) stopping or terminating 
the disorder or at least symptoms associated therewith, so that the host no longer 
suffers from the disorder or its symptoms, such as causing regression of the disorder 
or its symptoms, for example, by restoring or repairing a lost, missing or defective 
function, or stimulating an inefficient process, or (4) relieving, alleviating, or 
ameliorating the disorder, or symptoms associated therewith, where ameliorating is 
used in a broad sense to refer to at least a reduction in the magnitude of a parameter, 
such as inflammation, pain, and/or tumor size. 

[0361] A "pharmaceutically acceptable carrier," "pharmaceutically
acceptable diluent," "pharmaceutically acceptable excipient," or "pharmaceutically
acceptable vehicle," used interchangeably herein, refer to a non-toxic solid, semisolid 
or liquid filler, diluent, encapsulating material or formulation auxiliary of any 
conventional type. A pharmaceutically acceptable carrier is non-toxic to recipients at 
the dosages and concentrations employed and is compatible with other ingredients of 
the formulation. For example, the carrier for a formulation containing polypeptides
would not normally include oxidizing agents and other compounds that are known to 
be deleterious to polypeptides. Suitable carriers include, but are not limited to, water, 
dextrose, glycerol, saline, ethanol, and combinations thereof. The carrier can contain
additional agents such as wetting or emulsifying agents, pH buffering agents, or 
adjuvants which enhance the effectiveness of the formulation. Adjuvants of the 
invention include, but are not limited to, Freund's, Montanide ISA Adjuvants (Seppic,
Paris, France), Ribi's Adjuvants (Ribi ImmunoChem Research, Inc., Hamilton, MT),
Hunter's TiterMax (CytRx Corp., Norcross, GA), Aluminum Salt Adjuvants 
(Alhydrogel - Superfos of Denmark/Accurate Chemical and Scientific Co., Westbury, 
NY), Nitrocellulose-Adsorbed Protein, Encapsulated Antigens, and Gerbu Adjuvant 
(Gerbu Biotechnik GmbH, Gaiberg, Germany/C-C Biotech, Poway, CA). Topical 
carriers include liquid petroleum, isopropyl palmitate, polyethylene glycol, ethanol 
(95%), polyoxyethylene monolaurate (5%) in water, or sodium lauryl sulfate (5%) in 
water. Other materials such as anti-oxidants, humectants, viscosity stabilizers, and 
similar agents can be added as necessary. Percutaneous penetration enhancers such as 
Azone can also be included.

[0362] "Pharmaceutically acceptable salts" include the acid addition salts
(formed with the free amino groups of the polypeptide) and which are formed with
inorganic acids such as, for example, hydrochloric or phosphoric acids, or such
organic acids as acetic, mandelic, oxalic, and tartaric. Salts formed with the free
carboxyl groups can also be derived from inorganic bases such as, for example,
sodium, potassium, ammonium, calcium, or ferric hydroxides, and such organic bases
as isopropylamine, trimethylamine, 2-ethylamino ethanol, and histidine.

[0363] Compositions for oral administration can form solutions,
suspensions, tablets, pills, capsules, sustained release formulations, oral rinses, or
powders. 

[0364] The term "unit dosage form," as used herein, refers to physically 
discrete units suitable as unitary dosages for human and animal subjects, each unit 
containing a predetermined quantity of compounds of the present invention calculated 
in an "effective amount," that is, a dosage sufficient to produce the desired result or 
effect in association with a pharmaceutically acceptable carrier. The specifications for 
the novel unit dosage forms of the present invention depend on the particular 
compound employed, the host, and the effect to be achieved, as well as the 
pharmacodynamics associated with each compound in the host.

Compositions

[0365] The present invention provides novel isolated polynucleotides
encoding polypeptides and fragments thereof. The present invention also provides 
novel isolated polypeptides, fragments thereof, and compositions comprising same. 
The present invention further provides polynucleotide compositions that can be used 
to identify the polypeptides. 

[0366] The present invention provides recombinant vectors and host cells for 
use in gene expression, primer pairs for use in hybridizations, computer-based 
embodiments for use in bioinformatics, and transgenic animals and embryonic stem
cell lines for use in mutating and regulating gene expression. 

Nucleic Acids 

Sequences 

[0367] This invention provides genes encoding proteins, the encoded 
proteins, and fragments and homologs thereof. It provides human polynucleotide
sequences and the corresponding mouse polynucleotide sequences.

[0368] The nucleic acids of the subject invention can encode all or a part of
the subject proteins. Double or single stranded fragments can be obtained from the
DNA sequence by chemically synthesizing oligonucleotides in accordance with
conventional methods, by restriction enzyme digestion, or by polymerase
chain reaction (PCR) amplification. The use of the polymerase chain reaction has
been described (Saiki et al., 1985) and current techniques have been reviewed
(Sambrook et al., 1989; McPherson et al., 2000; Dieffenbach and Dveksler, 1995).
For the most part, DNA fragments will be of at least about 5 nucleotides, at least 
about 8 nucleotides, at least about 10 nucleotides, at least about 15 nucleotides, at 
least about 18 nucleotides, at least about 20 nucleotides, at least about 25 nucleotides, 
at least about 30 nucleotides, or at least about 50 nucleotides, at least about 75 
nucleotides, or at least about 100 nucleotides. Nucleic acid compositions that encode 
at least six contiguous amino acids (i.e., fragments of 18 nucleotides or more), for 
example, nucleic acid compositions encoding at least 8 contiguous amino acids (i.e., 
fragments of 24 nucleotides or more), are useful in directing the expression or the 
synthesis of peptides that can be used as immunogens (Lerner, 1982; Shinnick et al.,
1983; Sutcliffe et al., 1983).
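
The correspondence between contiguous amino acids and fragment length stated above follows from the three-nucleotide codon. A minimal Python sketch of this arithmetic is given below; the function names are hypothetical, and the sketch assumes an in-frame coding fragment without a stop codon or untranslated flanking sequence.

```python
CODON_LENGTH = 3  # nucleotides per codon in the standard genetic code


def min_coding_nucleotides(n_amino_acids: int) -> int:
    """Smallest fragment length, in nucleotides, able to encode n amino acids."""
    if n_amino_acids < 0:
        raise ValueError("number of amino acids cannot be negative")
    return CODON_LENGTH * n_amino_acids


def max_encoded_amino_acids(n_nucleotides: int) -> int:
    """Largest number of whole codons contained in an in-frame fragment."""
    if n_nucleotides < 0:
        raise ValueError("number of nucleotides cannot be negative")
    return n_nucleotides // CODON_LENGTH
```

Six contiguous amino acids thus require 18 nucleotides and eight require 24, in agreement with the fragment lengths recited above.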

[0369] In some embodiments, a polynucleotide of the invention comprises a 
nucleotide sequence of at least about 5, at least about 8, at least about 10, at least 
about 15, at least about 18, at least about 20, at least about 25, at least about 30, at 
least about 50, at least about 75, at least about 100, at least about 150, at least about
200, at least about 250, at least about 300, at least about 350, at least about 400, at 
least about 450, at least about 500, at least about 550, at least about 600, at least about 
650, at least about 700, at least about 750, at least about 800, at least about 850, at 
least about 900, at least about 950, at least about 1000, at least about 1100, at least
about 1200, at least about 1300, at least about 1400, at least about 1500, at least about 
1600, at least about 1700, at least about 1800, at least about 1900, at least about 2000, 
at least about 2100, at least about 2200, at least about 2300, at least about 2400, at 
least about 2500, at least about 3000, at least about 4000, or at least about 5000 
contiguous nucleotides of any one of the sequences shown in SEQ ID NOS.: 1-104, or 
the coding region thereof, or a complement thereof.

[0370] In other embodiments, a polynucleotide of the invention has at least 
about 60%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least
about 90%, at least about 95%, at least about 97%, at least about 98%, or at least
about 99% nucleotide sequence identity with a nucleotide sequence, or a fragment
thereof, of the coding region of any one of the sequences shown in SEQ ID NOS.: 1-104,
or a complement thereof. These sequence variants include naturally-occurring
variants (e.g., SNPs, allelic variants, and homologs from other species), degenerate
variants, variants associated with disease or pathological states, and variants resulting 
from random or directed mutagenesis, as well as from chemical or other modification. 
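
For illustration, percent sequence identity of the kind recited above can be computed from a pairwise alignment. The Python sketch below assumes one common convention, namely identical aligned columns divided by total alignment length, with gap columns counted as mismatches. The function name and the gap symbol are hypothetical, and publicly available homology search programs may use different denominators.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity of two pre-aligned, equal-length sequences.

    Columns containing a gap in either sequence count as mismatches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("aligned sequences must not be empty")
    matches = sum(
        1
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x == y and x != gap
    )
    return 100.0 * matches / len(aligned_a)
```

Under this convention, two aligned ten-nucleotide sequences differing at a single position share 90% identity.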

[0371] In some embodiments, a polynucleotide of the invention comprises a 
nucleotide sequence that encodes a polypeptide comprising an amino acid sequence of
at least about 5, at least about 8, at least about 10, at least about 15, at least about 18, 
at least about 20, at least about 25, at least about 30, at least about 50, at least about 
75, at least about 100, at least about 150, at least about 200, at least about 250, at least 
about 300, at least about 350, at least about 400, at least about 450, at least about 500, 
at least about 550, at least about 600, at least about 650, at least about 700, at least 
about 750, at least about 800, at least about 850, at least about 900, at least about 950, 
or at least about 1000 contiguous amino acids of at least one of the sequences encoded 
by SEQ ID NOS.: 1-104. 

[0372] In some embodiments, the present invention includes a
polynucleotide selected from SEQ ID NOS.: 1-104 that contains 300 bp of the 5'
terminus of a protein-encoding polynucleotide sequence. Such a polynucleotide is
useful for the purposes of clustering gene sequences to determine gene family.

[0373] In further embodiments, a polynucleotide of the invention hybridizes 
under stringent hybridization conditions to a polynucleotide having the coding region
of any one of the sequences shown in SEQ ID NOS.: 1-104, or a complement
thereof. 

[0374] The polynucleotides of the invention include those that encode 
variants of the polypeptide sequences encoded by the polynucleotides of the Sequence 
Listing. In some embodiments, these polynucleotides encode variant polypeptides 
that include insertions, additions, deletions, or substitutions compared with the 
polypeptides encoded by the nucleotide sequences shown in SEQ ID NOS.: 1-104, 
and in Table 1. Conservative amino acid substitutions include serine/threonine,
valine/leucine/isoleucine, asparagine/histidine/glutamine, glutamic acid/aspartic acid,
etc. (Gonnet et al., 1992). 
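
The conservative substitution groups named above can be represented, by way of example only, as sets of one-letter amino acid codes. The Python sketch below covers only the four groups recited in this paragraph; the recitation ends with "etc.," so the list is not exhaustive, and the names `CONSERVATIVE_GROUPS` and `is_conservative` are hypothetical.

```python
# One-letter codes for the groups named in the text (not an exhaustive list).
CONSERVATIVE_GROUPS = [
    {"S", "T"},        # serine / threonine
    {"V", "L", "I"},   # valine / leucine / isoleucine
    {"N", "H", "Q"},   # asparagine / histidine / glutamine
    {"E", "D"},        # glutamic acid / aspartic acid
]


def is_conservative(original: str, replacement: str) -> bool:
    """Return True if both residues differ and fall within one listed group."""
    a, b = original.upper(), replacement.upper()
    if a == b:
        return False  # identical residues are not a substitution
    return any(a in group and b in group for group in CONSERVATIVE_GROUPS)
```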

[0375] The nucleic acids of the invention include degenerate variants that 
can be translated, according to the standard genetic code, to provide an amino acid 
sequence identical to that translated from the nucleic acid sequences herein. For
example, synonymous codons include GGG, GGA, GGC, and GGU, each encoding 
Glycine. 
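
As a minimal sketch of the degeneracy described above, the standard genetic code can be built as a lookup table and used to test whether two coding sequences are degenerate variants of one another. The names `CODON_TABLE`, `translate`, and `are_degenerate_variants` are hypothetical; the sketch assumes in-frame sequences and ignores any trailing partial codon.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code in UCAG order (first base varies slowest); "*" marks a stop.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(sequence: str) -> str:
    """Translate an in-frame RNA or DNA coding sequence to one-letter amino acids."""
    rna = sequence.upper().replace("T", "U")
    usable = len(rna) - len(rna) % 3  # drop any trailing partial codon
    return "".join(CODON_TABLE[rna[i:i + 3]] for i in range(0, usable, 3))


def are_degenerate_variants(seq_a: str, seq_b: str) -> bool:
    """True if the sequences differ in nucleotides but encode identical amino acids."""
    a = seq_a.upper().replace("T", "U")
    b = seq_b.upper().replace("T", "U")
    return a != b and translate(a) == translate(b)
```

With this table, GGG, GGA, GGC, and GGU each translate to glycine ("G"), so sequences differing only in the choice among these codons are degenerate variants.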

[0376] The nucleic acids of the invention include single nucleotide 
polymorphisms (SNPs), which occur frequently in eukaryotic genomes (Lander et al.,
2001). The nucleotide sequence determined from one individual of a species can
differ from other allelic forms present within the population. 

[0377] The nucleic acids of the invention include homologs of the
polynucleotides. The source of homologous genes can be any species, e.g., primate
species, particularly human; rodents, such as rats, hamsters, guinea pigs, and mice;
rabbits, canines, felines; cattle, such as bovines, goats, pigs, sheep, equines,
crustaceans, birds, chickens, reptiles, amphibians, fish, insects, plants, fungi, yeast, 
nematodes, etc. Among mammalian species, e.g., human and mouse, homologs have 
substantial sequence similarity, e.g., at least about 60% sequence identity, at least 
about 75% sequence identity, or at least about 80% sequence identity among 
nucleotide sequences. In many embodiments of interest, homology will be at least 
about 75%, at least about 80%, at least about 85%, at least about 90%, at least about
95%, at least about 97%, or at least about 98%, where in certain embodiments of 
interest homology will be as high as about 99%. 

[0378] Modifications in the native structure of nucleic acids, including
alterations in the backbone, sugars or heterocyclic bases, have been shown to increase
intracellular stability and binding affinity. Among useful changes in the backbone
chemistry are phosphorothioates; phosphorodithioates, where both of the
non-bridging oxygens are substituted with sulfur; phosphoroamidites; alkyl
phosphotriesters and boranophosphates. Achiral phosphate derivatives include
3'-O-5'-S-phosphorothioate, 3'-S-5'-O-phosphorothioate, 3'-CH2-5'-O-phosphonate
and 3'-NH-5'-O-phosphoroamidate. Peptide nucleic acids replace the entire ribose
phosphodiester backbone with a peptide linkage. 

[0379] Sugar modifications are also used to enhance stability and affinity. 
The α-anomer of deoxyribose can be used, where the base is inverted with respect to
the natural β-anomer. The 2'-OH of the ribose sugar can be altered to form
2'-O-methyl or 2'-O-allyl sugars, which provide resistance to degradation without
compromising affinity.

[0380] Modification of the heterocyclic bases must maintain proper base
pairing. Some useful substitutions include deoxyuridine for deoxythymidine;
5-methyl-2'-deoxycytidine and 5-bromo-2'-deoxycytidine for deoxycytidine.
5-propynyl-2'-deoxyuridine and 5-propynyl-2'-deoxycytidine have been shown to
increase affinity and biological activity when substituted for deoxythymidine and
deoxycytidine, respectively.

[0381] A genomic sequence of interest comprises the nucleic acid present
between the initiation codon and the stop codon, as defined in the listed sequences,
including all of the introns that are normally present in a native chromosome. It can
further include the 3' and 5' untranslated regions found in the mature mRNA. It can
further include specific transcriptional and translational regulatory sequences, such as
promoters, enhancers, etc., including about 1 kb, about 2 kb, and possibly more, of
flanking genomic DNA at either the 5' or 3' end of the transcribed region. The
genomic DNA can be isolated as a fragment of 100 kbp or smaller, and substantially
free of flanking chromosomal sequence. The genomic DNA flanking the coding
region, either 3' or 5', or internal regulatory sequences as sometimes found in introns,
contains sequences required for proper tissue and stage specific expression. 

[0382] Nucleic acid molecules of the invention can comprise heterologous 
nucleic acid molecules, i.e., nucleic acid molecules other than the subject nucleic acid 
molecules, of any length. For example, the subject nucleic acid molecules can be 
flanked on the 5' and/or 3' ends by heterologous nucleic acid molecules of from about 
1 nucleotide to about 10 nucleotides, from about 10 nucleotides to about 20
nucleotides, from about 20 nucleotides to about 50 nucleotides, from about 50
nucleotides to about 100 nucleotides, from about 100 nucleotides to about 250
nucleotides, from about 250 nucleotides to about 500 nucleotides, or from about 500
nucleotides to about 1000 nucleotides, or more in length.

[0383] The subject polynucleotides include those that encode fusion proteins
comprising the subject polypeptides fused to "fusion partners." For example, the
present soluble receptor or ligand can be fused to an immunoglobulin fragment, such
as an Fc fragment for stability in circulation or to fix complement. Other polypeptide
fragments that have equivalent capabilities as the Fc fragments can also be used
herein. 

[0384] The isolated nucleic acids of the invention can be used as probes to
detect and characterize gross alteration in a genomic locus, such as deletions,
insertions, translocations, and duplications, e.g., applying fluorescence in situ
hybridization (FISH) techniques to examine chromosome spreads (Andreeff et al.,
1999). The nucleic acids are also useful for detecting smaller genomic alterations,
such as deletions, insertions, additions, translocations, and substitutions (e.g., SNPs). 

[0385] When used as probes to detect nucleic acid molecules capable of
hybridizing with nucleic acids described in the Sequence Listing, the nucleic acid 
molecules can be flanked by heterologous sequences of any length. When used as 
probes, a subject nucleic acid can include nucleotide analogs that incorporate labels 
that are directly detectable, such as radiolabels or fluorophores, or nucleotide analogs 
that incorporate labels that can be visualized in a subsequent reaction, such as biotin 
or various haptens. Haptens that are commonly conjugated to nucleotides for
subsequent labeling include biotin, digoxigenin, and dinitrophenyl. 

[0386] Suitable fluorescent labels include fluorochromes, e.g., fluorescein
and its derivatives, e.g., fluorescein isothiocyanate (FITC), 6-carboxyfluorescein
(6-FAM), 2',7'-dimethoxy-4',5'-dichloro-6-carboxyfluorescein (JOE),
6-carboxy-2',4',7',4,7-hexachlorofluorescein (HEX), 5-carboxyfluorescein (5-FAM); coumarin
and its derivatives, e.g., 7-amino-4-methylcoumarin, aminocoumarin; bodipy dyes,
such as Bodipy FL; cascade blue; Oregon green; rhodamine dyes, e.g., rhodamine,
6-carboxy-X-rhodamine (ROX), Texas red, phycoerythrin, and tetramethylrhodamine;
eosins and erythrosins; cyanine dyes, e.g., allophycocyanin, Cy3 and Cy5 or
N,N,N',N'-tetramethyl-6-carboxyrhodamine (TAMRA); macrocyclic chelates of
lanthanide ions, e.g., quantum dye, etc.; and chemiluminescent molecules, e.g.,
luciferases.

[0387] Fluorescent labels also include a green fluorescent protein (GFP), i.e.,
a "humanized" version of a GFP, e.g., wherein codons of the naturally-occurring
nucleotide sequence are changed to more closely match human codon bias; a GFP
derived from Aequorea victoria or a derivative thereof, e.g., a "humanized" derivative
such as Enhanced GFP, which are available commercially, e.g., from Clontech, Inc.;
other fluorescent mutants of a GFP from Aequorea victoria, e.g., as described in U.S.
Patent No. 6,066,476; 6,020,192; 5,985,577; 5,976,796; 5,968,750; 5,968,738;
5,958,713; 5,919,445; 5,874,304; a GFP from another species such as Renilla
reniformis, Renilla mulleri, or Ptilosarcus guernyi, as previously described (WO
99/49019; Peelle et al., 2001), "humanized" recombinant GFP (hrGFP) (Stratagene®);
any of a variety of fluorescent and colored proteins from Anthozoan species (e.g.,
Matz et al., 1999).

[0388] Probes can also contain fluorescent analogs, including commercially 
available fluorescent nucleotide analogs that can readily be incorporated into a subject 
nucleic acid. These include deoxyribonucleotide and/or ribonucleotide analogs 
labeled with Cy3, Cy5, Texas Red, Alexa Fluor dyes, rhodamine, cascade blue, or 
BODIPY, and the like. 

[0389] Suitable radioactive labels include, e.g., ³²P and ³⁵S. For 
example, probes can contain radiolabeled analogs, including those commonly labeled 
with ³²P or ³⁵S, such as α-³²P-dATP, -dTTP, -dCTP, and -dGTP; γ-³⁵S-GTP and 
α-³⁵S-dATP, and the like. 

[0390] Nucleic acids of the invention can also be bound to a substrate. 
Subject nucleic acids can be attached covalently, attached to a surface of the support 
or applied to a derivatized surface in a chaotropic agent that facilitates denaturation 
and adherence, e.g., by noncovalent interactions, or some combination thereof. The 
nucleic acids can be bound to a substrate to which a plurality of other nucleic acids 
are concurrently bound, hybridization to each of the plurality of the bound nucleic 
acids being separately detectable. 

[0391] The substrate can be porous or solid, planar or non-planar, unitary or 
distributed; and the bond between the nucleic acid and the substrate can be covalent or 
non-covalent. The substrate can be in the form of microbeads or nanobeads. 
Substrates include, but are not limited to, a membrane, such as nitrocellulose, nylon, 
or positively-charged derivatized nylon; and a solid substrate such as glass, amorphous 
silicon, crystalline silicon, or plastics (including, e.g., polymethylacrylic, polyethylene, 
polypropylene, polyacrylate, polymethylmethacrylate, polyvinylchloride, 
polytetrafluoroethylene, polystyrene, polycarbonate, polyacetal, polysulfone, cellulose 
acetate, or mixtures thereof). 

[0392] The subject nucleic acids include antisense RNA, ribozymes, and 
RNAi. Further, the nucleic acids of the invention can be used for antisense or RNAi 
inhibition of transcription or translation using methods known in the art (Phillips, 
1999a; Phillips, 1999b; Hartmann et al., 1999; Stein et al., 1998; Agrawal et al., 
1998). 
Expression Vectors 

[0393] The instant invention further provides host cells, e.g., recombinant 
host cells, that comprise a subject nucleic acid, host cells that comprise a recombinant 
vector, and host cells that secrete antibodies of the invention. Subject host cells can 
be cultured in vitro, or can be part of a multicellular organism. Host cells are 
described in more detail below. The instant invention further provides transgenic 
plants and non-human animals, as described in more detail below. 

[0394] In addition to the plurality of uses described in greater detail in 
following sections, the subject nucleic acids find use in the preparation of all or a 
portion of the polypeptides of the subject invention, as described above, using an 
expression system. For expression, an expression vector can be employed. The 
expression vector will provide a transcriptional and translational initiation region, 
which may be inducible, conditionally-active, constitutive, or tissue-specific, where 
the coding region is operably linked under the transcriptional control of the 
transcriptional initiation region, and a transcriptional and translational termination 
region. These control regions can be native to a gene encoding the subject peptides, 
or can be derived from heterologous or exogenous sources. 

[0395] The subject nucleic acids can also be provided as part of a vector 
(e.g., a polynucleotide construct comprising an expression cassette), a wide variety of 
which are known in the art. Vectors include, but are not limited to, plasmids; 
cosmids; viral vectors; human, yeast, bacterial, and P1-derived artificial chromosomes 
(HACs, YACs, BACs, PACs, etc.); mini-chromosomes; and the like. Vectors are 
amply described in numerous publications well known to those in the art (Ausubel, et 
al.; Jones et al., 1998a; Jones et al., 1998b). Vectors can provide for nucleic acid 
expression, for nucleic acid propagation, or both. 

[0396] A recombinant vector or construct that includes a nucleic acid of the 
invention is useful for propagating a nucleic acid in a host cell; such vectors are 
known as "cloning vectors." Vectors can transfer nucleic acid between host cells 
derived from disparate organisms; these are known in the art as "shuttle vectors." 
Vectors can also insert a subject nucleic acid into a host cell's chromosome; these are 
known in the art as "insertion vectors." Vectors can express either sense or antisense 
RNA transcripts of the invention in vitro (e.g., in a cell-free system or within an in 
vitro cultured host cell) or in vivo (e.g., in a multicellular plant or animal); these are 
known in the art as "expression vectors," which can be part of an expression system. 
Expression vectors can also produce a subject antibody. 

[0397] Vectors typically include at least one origin of replication, at least 
one site for insertion of heterologous nucleic acid (e.g., in the form of a polylinker 
with multiple, tightly clustered, single cutting restriction endonuclease recognition 
sites), and at least one selectable marker, although some integrative vectors will lack 
an origin that is functional in the host to be chromosomally modified, and some 
vectors will lack selectable markers. Vectors can be transiently or stably maintained 
in the cells, usually for a period of at least about one day, or at least about several 
days to at least about several weeks. 

[0398] Prior to vector insertion, the DNA of interest will be obtained 
substantially free of other nucleic acid sequences. The DNA can be "recombinant," 
and flanked by one or more nucleotides with which it is not normally associated on a 
naturally occurring chromosome. 

[0399] Expression vectors generally have convenient restriction sites located 
near the promoter sequence to provide for the insertion of nucleic acid sequences 
encoding heterologous protein or RNA molecules. A selectable marker operative in 
the expression system or host can be present. Expression vectors can be used for the 
production of fusion proteins, where the fusion peptide provides additional 
functionality, e.g., increased protein synthesis, a leader sequence for secretion, 
stability, reactivity with defined antisera, or an enzyme marker, e.g., β-galactosidase. 

[0400] Promoters of the invention can be naturally contiguous or not 
naturally contiguous to the expressed nucleic acid molecule. Promoters can be 
inducible, conditionally-active (such as the cre-lox promoter), constitutive, and/or 
tissue-specific. 

[0401] Expression vectors can be prepared comprising a transcription 
cassette comprising a transcription initiation region, the gene or fragment thereof, and 
a transcriptional termination region. Of particular interest is the use of DNA 
sequences that allow for the expression of functional epitopes or domains, at least 
about 5, at least about 8, at least about 10, at least about 15, at least about 18, at least 
about 20, at least about 25, at least about 30, at least about 50, at least about 75, at 
least about 100, at least about 150, at least about 200, at least about 250, at least about 
300, at least about 350, at least about 400, at least about 450, at least about 500, at 
least about 550, at least about 600, at least about 650, at least about 700, at least about 
750, at least about 800, at least about 850, at least about 900, at least about 950, or at 
least about 1000 amino acids in length, or any of the above-described fragments, up to 
and including the complete open reading frame of the gene. After introduction of 
these DNA sequences, the cells containing the vector construct can be selected by 
means of a selectable marker, and the selected cells expanded and used as expression- 
competent host cells. 

[0402] Host cells can comprise prokaryotes or eukaryotes that express 
proteins and polypeptides in accordance with conventional methods, the method 
depending on the purpose for expression. For large scale production of the protein, a 
unicellular organism, such as E. coli, B. subtilis, S. cerevisiae, insect cells in 
combination with baculovirus vectors, or cells of a higher organism such as 
vertebrates, particularly mammals, e.g., COS 7 cells, can be used as the expression 
host cells. In some situations, it is desirable to express eukaryotic genes in eukaryotic 
cells, where the encoded protein will benefit from native folding and 
post-translational modifications. 

[0403] Specific expression systems of interest include plants, bacteria, yeast, 
insect cells, and mammalian cell-derived expression systems. Representative systems 
from each of these categories are provided below. 

[0404] Expression systems in plants include those described in U.S. Patent 
No. 6,096,546 and U.S. Patent No. 6,127,145. 

[0405] Expression systems in bacteria include those described by Chang et 
al., 1978; Goeddel et al., 1979; Goeddel et al., 1980; EP 0 036,776; U.S. Patent No. 
4,551,433; DeBoer et al., 1983; and Siebenlist et al., 1980. 

[0406] Expression systems in yeast include those described by Hinnen et al., 
1978; Ito et al., 1983; Kurtz et al., 1986; Kunze et al., 1985; Gleeson et al., 1986; 
Roggenkamp et al., 1986; Das et al., 1984; De Louvencourt et al., 1983; Van den 
Berg et al., 1990; Kunze et al., 1985; Cregg et al., 1985; U.S. Patent Nos. 4,837,148 
and 4,929,555; Beach and Nurse, 1981; Davidow et al., 1985; Gaillardin et al., 1985; 
Ballance et al., 1983; Tilburn et al., 1983; Yelton et al., 1984; Kelly and Hynes, 1985; 
EP 0 244,234; WO 91/00357; and U.S. Patent No. 6,080,559. 

[0407] Expression systems for heterologous genes in insects include those 
described in U.S. Patent No. 4,745,051; Friesen et al., 1986; EP 0 127,839; EP 0 
155,476; Vlak et al., 1988; Miller et al., 1988; Carbonell et al., 1988; Maeda et al., 
1985; Lebacq-Verheyden et al., 1988; Smith et al., 1985; Miyajima et al., 1987; and 
Martin et al., 1988. Numerous baculoviral strains and variants and corresponding 
permissive insect host cells are described in Luckow et al., 1988, Miller et al., 1986, 
and Maeda et al., 1985. The insect cell expression system is useful not only for 
production of heterologous proteins intracellularly, but can also be used for expression 
of transmembrane proteins on the insect cell surfaces. Such insect cells can be used as 
immunogen for production of antibodies, for example, by injection of the insect cells 
into mice, rabbits, or other suitable animals. 

[0408] Mammalian expression systems include those described in Dijkema 
et al., 1985; Gorman et al., 1982; Boshart et al., 1985; and U.S. Patent No. 4,399,216. 
Additional features of mammalian expression are facilitated as described in Ham and 
Wallace, 1979; Barnes and Sato, 1980; U.S. Patent Nos. 4,767,704, 4,657,866, 
4,927,762, 4,560,655, WO 90/103430, WO 87/00195, and U.S. RE 30,985. 
Mammalian cell expression systems can also be used for production of antibodies. 

[0409] The present polynucleotides can also be used in cell-free expression 
systems, such as a bacterial system (e.g., E. coli lysate), a rabbit reticulocyte lysate 
system, a wheat germ extract system, a frog oocyte lysate system, and the like, which 
are conventional in the art. See, for example, WO 00/68412, WO 01/27260, WO 
02/24939, WO 02/38790, WO 91/02076, and WO 91/02075. 

[0410] When any of the above-referenced host cells, or other appropriate 
host cells or organisms, are used to replicate and/or express the polynucleotides of the 
invention, the resulting replicated nucleic acid, RNA, expressed protein or 
polypeptide, is within the scope of the invention as a product of the host cell or 
organism. 

[0411] Once the gene corresponding to a selected polynucleotide is 
identified, its expression can be regulated in the gene's native cell types. For example, 
an endogenous gene of a cell can be regulated by an exogenous regulatory sequence 
inserted into the genome of the cell at a location that will enhance or reduce 
expression of the gene corresponding to the subject polypeptide. The regulatory 
sequence can be designed to integrate into the genome via homologous 
recombination, as disclosed in U.S. Patent Nos. 5,641,670 and 5,733,761, the 
disclosures of which are herein incorporated by reference. Alternatively, it can be 
designed to integrate into the genome via non-homologous recombination, as 
described in WO 99/15650, the disclosure of which is also herein incorporated by 
reference. Also encompassed in the subject invention is the production of proteins 
without manipulating the encoding nucleic acid itself, but rather by integrating a 
regulatory sequence into the genome of a cell that already includes a gene that 
encodes the protein of interest; this production method is described in the above- 
incorporated patent documents. 
Isolated Primer Pairs 

[0412] In some embodiments, the invention provides isolated nucleic acids 
that, when used as primers in a polymerase chain reaction, amplify a subject 
polynucleotide, or a polynucleotide containing a subject polynucleotide. The 
amplified polynucleotide is from about 20 to about 50, from about 50 to about 75, 
from about 75 to about 100, from about 100 to about 125, from about 125 to about 
150, from about 150 to about 175, from about 175 to about 200, from about 200 to 
about 250, from about 250 to about 300, from about 300 to about 350, from about 350 
to about 400, from about 400 to about 500, from about 500 to about 600, from about 
600 to about 700, from about 700 to about 800, from about 800 to about 900, from 
about 900 to about 1000, from about 1000 to about 2000, from about 2000 to about 
3000, from about 3000 to about 4000, from about 4000 to about 5000, or from about 
5000 to about 6000 nucleotides or more in length. 

[0413] The isolated nucleic acids themselves are from about 10 to about 20, 
from about 20 to about 30, from about 30 to about 40, from about 40 to about 50, 
from about 50 to about 100, or from about 100 to about 200 nucleotides in length. 
Generally, the nucleic acids are used in pairs in a polymerase chain reaction, where 
they are referred to as "forward" and "reverse" primers. 

[0414] Thus, in some embodiments, the invention provides a pair of isolated 
nucleic acid molecules, each from about 10 to about 200 nucleotides in length, the 
first nucleic acid molecule of the pair comprising a sequence of at least 10 contiguous 
nucleotides having 100% sequence identity to a nucleic acid sequence as shown in 
SEQ ID NOS.: 1 - 104 and the second nucleic acid molecule of the pair comprising a 
sequence of at least 10 contiguous nucleotides having 100% sequence identity to the 
reverse complement of the nucleic acid sequence shown in SEQ ID NOS.: 1-104, 
wherein the sequence of the second nucleic acid molecule is located 3' of the nucleic 
acid sequence of the first nucleic acid molecule shown in SEQ ID NOS.: 1-104. The 
primer nucleic acids are prepared using any known method, e.g., automated synthesis, 
and can be chosen to specifically amplify a cDNA copy of an mRNA encoding a 
subject polypeptide. 
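
The primer geometry described above can be sketched in a few lines of Python. This is a minimal illustration only: the template is an invented placeholder rather than a sequence from the Sequence Listing, and the function names `reverse_complement` and `primer_pair` are hypothetical. The forward primer is identical to the start of the target region, and the reverse primer is identical to the reverse complement of its 3' end.

```python
# Hypothetical helper names; the template below is an invented placeholder,
# not a sequence from the Sequence Listing.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def primer_pair(template, start, end, primer_len=10):
    """Return (forward, reverse) primers that flank template[start:end].

    The forward primer matches the first primer_len bases of the target
    region. The reverse primer matches the reverse complement of the last
    primer_len bases, so it anneals 3' of the forward primer.
    """
    if primer_len < 10:
        raise ValueError("primers in this sketch are at least 10 nucleotides")
    if end - start < 2 * primer_len:
        raise ValueError("target region is too short for two primers")
    target = template[start:end].upper()
    forward = target[:primer_len]
    reverse = reverse_complement(target[-primer_len:])
    return forward, reverse


template = "ATGGCCAAGCTTGGATCCGTCGACCTGCAGGAATTCTAA"
forward, reverse = primer_pair(template, 0, len(template))
print("forward:", forward)
print("reverse:", reverse)
```

A practical primer design would also weigh melting temperature, GC content, and specificity, none of which this sketch attempts.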

[0415] In some embodiments, the first and/or the second nucleic acid 
molecules comprise a detectable label. The label can be a radioactive molecule, 
fluorescent molecule or another molecule, e.g., hapten, as described in detail above. 
Further, the label can be a two stage system, where the amplified DNA is conjugated 
to another molecule, e.g., biotin, digoxin, or a hapten, that has a high affinity binding 
partner, e.g., avidin, antidigoxin, or a specific antibody, respectively, and the binding 
partner is conjugated to a detectable label. The label can be conjugated to one or both 
of the primers. Alternatively, the pool of nucleotides used in the amplification is 
labeled, so as to incorporate the label into the amplification product. 

[0416] Conditions that increase stringency of both DNA/DNA and 
DNA/RNA hybridization reactions are widely known and published in the art. See, 
for example, Sambrook, 1989, and examples provided above. Examples of relevant 
conditions include (in order of increasing stringency): incubation temperatures of 
25°C, 37°C, 50°C, and 68°C; buffer concentrations of 10 x SSC, 6 x SSC, 1 x SSC, and 
0.1 x SSC (where 1 x SSC is 0.15 M NaCl and 15 mM citrate buffer) and their 
equivalents using other buffer systems; formamide concentrations of 0%, 25%, 50%, 
and 75%; incubation times from 5 minutes to 24 hours; 1, 2, or more washing steps; 
wash incubation times of 1, 2, or 15 minutes; and wash solutions of 6 x SSC, 1 x SSC, 
0.1 x SSC, or deionized water. 
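
The buffer concentrations listed above follow directly from the stated composition of 1 x SSC. The short Python sketch below converts a fold-concentration into molar NaCl and citrate and computes a simple dilution. The function names are hypothetical, and the 20 x stock used in the dilution example is an assumption chosen for illustration.

```python
# Composition of 1 x SSC as stated in the text: 0.15 M NaCl, 15 mM citrate.
NACL_M_AT_1X = 0.15
CITRATE_M_AT_1X = 0.015


def ssc_molarity(fold):
    """Return (NaCl, citrate) molar concentrations for a given fold of SSC."""
    if fold < 0:
        raise ValueError("fold must be non-negative")
    return fold * NACL_M_AT_1X, fold * CITRATE_M_AT_1X


def stock_volume_ml(stock_fold, target_fold, final_volume_ml):
    """Volume of stock needed for the final volume, from C1 * V1 = C2 * V2."""
    if target_fold > stock_fold:
        raise ValueError("cannot dilute to a higher concentration")
    return final_volume_ml * target_fold / stock_fold


# Lower salt corresponds to higher stringency, so this list runs from the
# least stringent buffer to the most stringent one.
for fold in (10, 6, 1, 0.1):
    nacl, citrate = ssc_molarity(fold)
    print(f"{fold} x SSC: {nacl:.3f} M NaCl, {citrate * 1000:.1f} mM citrate")

# Assumed 20 x stock: volume needed to prepare 100 ml of 6 x SSC.
print(stock_volume_ml(20, 6, 100), "ml of stock")
```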

[0417] For example, "high stringency conditions" include hybridization in 
50% formamide, 5X SSC, 0.2 ng/µl poly(dA), 0.2 µg/µl human Cot-1 DNA, and 0.5% 
SDS, in a humid oven at 42°C overnight, followed by successive washes in 1X SSC, 
0.2% SDS at 55°C for 5 minutes, followed by washing at 0.1X SSC, 0.2% SDS at 
55°C for 20 minutes. Further examples of high stringency conditions include 
hybridization at 50°C and 0.1X SSC (15 mM sodium chloride/1.5 mM sodium citrate); 
overnight incubation at 42°C in a solution containing 50% formamide, 1X SSC (150 
mM NaCl, 15 mM sodium citrate), 50 mM sodium phosphate (pH 7.6), 5X 
Denhardt's solution, 10% dextran sulfate, and 20 ng/ml denatured, sheared salmon 
sperm DNA, followed by washing the filters in 0.1X SSC at about 65°C. High 
stringency conditions also include aqueous hybridization (e.g., free of formamide) in 
6X SSC (where 20X SSC contains 3.0 M NaCl and 0.3 M sodium citrate), 1% sodium 
dodecyl sulfate (SDS) at 65°C for about 8 hours (or more), followed by one or more 
washes in 0.2X SSC, 0.1% SDS at 65°C. Highly stringent hybridization conditions 
are hybridization conditions that are at least as stringent as any one of the above 
representative conditions. Other stringent hybridization conditions are known in the 
art and can also be employed to identify nucleic acids of this particular embodiment 
of the invention. 

[0418] Conditions of "reduced stringency," suitable for hybridization to 
molecules encoding structurally and functionally related proteins, or otherwise 
serving related or associated functions, are the same as those for high stringency 
conditions but with a reduction in temperature for hybridization and washing to lower 
temperatures (e.g., room temperature or about 22°C to 25°C). For example, moderate 
stringency conditions include aqueous hybridization (e.g., free of formamide) in 6X 
SSC, 1% SDS at 65°C for about 8 hours (or more), followed by one or more washes in 
2X SSC, 0.1% SDS at room temperature. Low stringency conditions include, for 
example, aqueous hybridization at 50°C and 6X SSC (0.9 M sodium chloride/0.09 M 
sodium citrate) and washing at 25°C in 1X SSC (0.15 M sodium chloride/0.015 M 
sodium citrate). 

[0419] The specificity of a hybridization reaction allows any single-stranded 
sequence of nucleotides to be labeled with a radioisotope or chemical and used as a 
probe to find a complementary strand, even in a cell or cell extract that contains 
millions of different DNA and RNA sequences. Probes of this type are widely used to 
detect the nucleic acids corresponding to specific genes, both to facilitate the 
purification and characterization of the genes after cell lysis and to localize them in 
cells, tissues, and organisms. 

[0420] Moreover, by carrying out hybridization reactions under conditions of 
"reduced stringency," a probe prepared from one gene can be used to find 
homologous evolutionary relatives - both in the same organism, where the relatives 
form part of a gene family, and in other organisms, where the evolutionary history of 
the nucleotide sequence can be traced. A person skilled in the art would recognize 
how to modify the conditions to achieve the requisite degree of stringency for a 
particular hybridization. 
Libraries 

[0421] The polynucleotide libraries of the invention generally comprise a 
collection of sequence information of a plurality of polynucleotide sequences, where 
at least one of the polynucleotides has a sequence shown in SEQ ID NOS.: 1-104. 
By plurality is meant at least 2, at least 3, or at least all of the sequences in the 
Sequence Listing. The information may be provided in either biochemical form (e.g., 
as a collection of polynucleotide molecules), or in electronic form (e.g., as a 
collection of polynucleotide sequences stored in a computer-readable form, as in a 
computer-based system, a computer data file, and/or as a part of a computer program). 
The length and number of polynucleotides in the library will vary with the nature of 
the library, e.g., if the library is an oligonucleotide array, a cDNA array, or a 
computer database of the sequence information. 

[0422] The sequence information contained in either a biochemical or an 
electronic library of polynucleotides can be used in a variety of ways, e.g., as a 
resource for gene discovery, as a representation of sequences expressed in a selected 
cell type (e.g., cell type markers), or as markers of a given disorder or disease state. 
In general, a disease marker is a representation of a gene product that is present in all 
cells affected by disease either at an increased or decreased level relative to a normal 
cell (e.g., a cell of the same or similar type that is not substantially affected by 
disease). For example, a polynucleotide sequence in a library can be a polynucleotide 
that represents an mRNA, polypeptide, or other gene product encoded by the 
polynucleotide, that is either over-expressed or under-expressed in one cell compared 
to another (e.g., a first cell type compared to a second cell type; a normal cell 
compared to a diseased cell; a cell not exposed to a signal or stimulus compared to a 
cell exposed to that signal or stimulus; and the like). 

[0423] The nucleotide sequence information of the library can be embodied 
in any suitable form, e.g., electronic or biochemical forms. For example, a library of 
sequence information embodied in electronic form comprises an accessible computer 
data file that may contain the representative nucleotide sequences of genes that are 
differentially expressed (e.g., over-expressed or under-expressed) as between, e.g., a 
first cell type compared to a second cell type (e.g., expression in a brain cell compared 
to expression in a kidney cell); a normal cell compared to a diseased cell (e.g., a non- 
cancerous cell compared to a cancerous cell); a cell not exposed to an internal or 
external signal or stimulus compared to a cell exposed to that signal or stimulus (e.g., 
a cell contacted with a ligand compared to a control cell not contacted with the 
ligand); and the like. Other combinations and comparisons of cells will be readily 
apparent to the ordinarily skilled artisan. Biochemical embodiments of the library 
include a collection of nucleic acid molecules that have the sequences of the genes in 
the library, where the nucleic acids can correspond to the entire gene in the library or 
to a fragment thereof, as described in greater detail below. 

[0424] Where the library is an electronic library, the nucleic acid sequence 
information can be present in a variety of media. For example, the nucleic acid 
sequences of any of the polynucleotides shown in SEQ ID NOS.: 1 - 104 can be 
recorded on computer readable media of a computer-based system, e.g., any medium 
that can be read and accessed directly by a computer. One of skill in the art can 
readily appreciate how any of the presently known computer readable mediums can 
be used to create a manufacture comprising a recording of the present sequence 
information. Any convenient data storage structure can be chosen, based on the 
means used to access the stored information. A variety of data processor programs 
and formats can be used for storage, e.g., word processing text file, database format, 
etc. In addition to the sequence information, electronic versions of the libraries of the 
invention can be provided in conjunction or connection with other computer-readable 
information and/or other types of computer-based files (e.g., searchable files, 
executable files, etc, including, but not limited to, for example, search program 
software, etc.). 
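
As one illustration of an electronic library stored as a plain text file, the Python sketch below writes a small set of sequences in the widely used FASTA layout and reads them back. The record names and sequences are invented placeholders, the helper names `write_fasta` and `read_fasta` are hypothetical, and FASTA is only one of many storage formats that would fit the description above.

```python
import io


def write_fasta(records, handle, width=60):
    """Write a {name: sequence} mapping to handle in FASTA layout."""
    for name, seq in records.items():
        handle.write(f">{name}\n")
        for i in range(0, len(seq), width):
            handle.write(seq[i:i + width] + "\n")


def read_fasta(handle):
    """Parse FASTA text from handle into a {name: sequence} mapping."""
    parts = {}
    name = None
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:]
            parts[name] = []
        elif name is not None:
            parts[name].append(line)
    return {key: "".join(chunks) for key, chunks in parts.items()}


# Invented placeholder records, not sequences from the Sequence Listing.
library = {
    "example_1": "ATGGCCAAGCTTGGATCC",
    "example_2": "ATGTTTCCCGGGAAATAG",
}

buffer = io.StringIO()          # stands in for a file on disk
write_fasta(library, buffer, width=10)
buffer.seek(0)
restored = read_fasta(buffer)
print(restored == library)
```

An in-memory buffer is used so the sketch runs without touching the file system; an ordinary file opened with `open()` could be passed in its place.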

[0425] By providing the nucleotide sequence in computer readable form in a 
computer-based system, the information can be accessed for a variety of purposes. 
Computer software to access sequence information is publicly available. 
Conventional bioinformatics tools can be utilized to analyze sequences to determine 
sequence identity, sequence similarity, and gap information. For example, the gapped 
BLAST (Altschul et al., 1990, Altschul et al., 1997), and BLAZE (Brutlag et al., 
1993) search algorithms on a Sybase system, or the TeraBLAST (TimeLogic, Crystal 
Bay, Nevada) program optionally running on a specialized computer platform 
available from TimeLogic, can be used to identify open reading frames (ORFs) within 
the genome that contain homology to ORFs from other organisms. Homology 
between sequences of interest can be determined using the local homology algorithm 
of Smith and Waterman, 1981, as well as the BestFit program (Rechid et al., 1989), 
and the FastDB algorithm (FastDB, 1988; described in Current Methods in Sequence 
Comparison and Analysis, Macromolecule Sequencing and Synthesis, Selected 
Methods and Applications, pp. 127-149, 1988, Alan R. Liss, Inc). 
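
To make the notion of an open reading frame concrete, the following Python sketch scans the forward strand of a sequence for ATG-to-stop ORFs in each of the three reading frames. It is a naive scan written for illustration and does not reproduce BLAST, BLAZE, TeraBLAST, or any homology search; the function name `find_orfs`, the `min_codons` threshold, and the example sequence are assumptions.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=3):
    """Return (start, end, frame) for forward-strand ORFs.

    Each ORF runs from an ATG to the first in-frame stop codon. The end
    index is exclusive and includes the stop codon. min_codons counts
    every codon in the ORF, including the start and stop codons.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None
    return orfs


# Invented example sequence containing two short ORFs in frame 2.
example = "CCATGGCTTGATAAATGAAACCCGGGTAG"
for start, end, frame in find_orfs(example):
    print(frame, start, end, example[start:end])
```

A fuller treatment would also scan the reverse complement and apply a realistic minimum length.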

[0426] Alignment programs that permit gaps in the sequence include 
ClustalW (Thompson et al., 1994), FASTA3 (Pearson, 2000), Align0 (Myers and 
Miller, 1988), and T-Coffee (Notredame et al., 2000). Other methods for comparing 
and aligning nucleotide and protein sequences include, for example, BLASTX 
(NCBI), the Wise package (Birney and Durbin, 2000), and FASTX (Pearson, 2000). 
These algorithms determine sequence homology between nucleotide and protein 
sequences without translating the nucleotide sequences into protein sequences. Other 
techniques for alignment are also known in the art (Doolittle, et al., 1996; BLAST, 
available from the National Center for Biotechnology Information; FASTA, available 
in the Genetics Computing Group (GCG) package, from Madison, Wisconsin, USA, a 
wholly owned subsidiary of Oxford Molecular Group, Inc.; Schlessinger, 1988a; 
Schlessinger, 1988b; and Needleman and Wunsch, 1970). 
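
A gapped global alignment of the kind attributed above to Needleman and Wunsch can be sketched with a small dynamic-programming table. The Python below is a minimal illustration with a linear gap penalty; the scoring values (match +1, mismatch -1, gap -1) and the example sequences are assumptions, and the programs named above use more elaborate scoring schemes.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Return (score, aligned_a, aligned_b) for a global alignment."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap

    def pair_score(i, j):
        return match if a[i - 1] == b[j - 1] else mismatch

    # Fill the table: each cell is the best of a diagonal move (match or
    # mismatch), a gap in b (move down), or a gap in a (move right).
    for i in range(1, rows):
        for j in range(1, cols):
            score[i][j] = max(
                score[i - 1][j - 1] + pair_score(i, j),
                score[i - 1][j] + gap,
                score[i][j - 1] + gap,
            )

    # Trace back from the bottom-right corner to recover one best alignment.
    out_a, out_b = [], []
    i, j = len(a), len(b)
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair_score(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[-1][-1], "".join(reversed(out_a)), "".join(reversed(out_b))


best, top, bottom = needleman_wunsch("GATTACA", "GATACA")
print(best)
print(top)
print(bottom)
```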

[0427] Sequence similarity is calculated based on a reference sequence, 
which may be a subset of a larger sequence, such as a conserved motif, coding region, 
flanking region, etc. The reference sequence is usually at least about 18 nt long, at 
least about 30 nt long, or may extend to the complete sequence that is being 
compared. 

[0428] One parameter for determining percent sequence identity is the 
percentage of the alignment in the region of strongest alignment between a target and 
a query sequence. Methods for determining this percentage involve, for example, 
counting the number of aligned bases of a query sequence in the region of strongest 
alignment and dividing this number by the total number of bases in the region. For 
example, 10 matches divided by 11 total residues gives a percent sequence identity of 
approximately 90.9%. The length of the aligned region is typically at least about 
55%, at least about 58%, or at least about 60% of the total sequence length, and can 
be as great as about 62%, as great as about 64%, and even as great as about 66% of 
the total sequence length. 
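
The calculation described above, matches divided by the length of the aligned region, can be written as a short Python function. This sketch assumes the two strings are already aligned and of equal length, with "-" marking gaps; the function name `percent_identity` and the example strings are hypothetical.

```python
def percent_identity(aligned_query, aligned_target):
    """Percent identity over an aligned region of equal-length strings.

    Gap characters ("-") count toward the region length but never count
    as matches.
    """
    if len(aligned_query) != len(aligned_target):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_query:
        raise ValueError("aligned region is empty")
    matches = sum(
        1
        for q, t in zip(aligned_query, aligned_target)
        if q == t and q != "-"
    )
    return 100.0 * matches / len(aligned_query)


# Invented example: 10 matching positions out of 11 aligned residues.
query = "ACGTACGTACG"
target = "ACGTACGTACC"
print(round(percent_identity(query, target), 1))
```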

[0429] The present invention includes human and mouse polynucleotide and 
polypeptide sequences that are at least about 95%, at least about 96%, at least about 
97%, at least about 98%, or at least about 99% homologous to the sequences in the 
Sequence Listing, based on using the method of determining sequence identity with 
the insertion of gaps to detect the maximum degree of sequence identity. In other 
embodiments of interest, homology will be at least about 80%, at least about 85%, or 
as high as about 90%. 

[0430] A variety of structural formats for the input and output means can be 
used to input and output the information in the computer-based systems of the present 
invention. One format for an output means ranks the relative expression levels of 
different polynucleotides. Such presentation provides a skilled artisan with a ranking 
of relative expression levels to determine a gene expression profile. 
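
An output format that ranks relative expression levels can be illustrated with a simple sort. In the Python sketch below the polynucleotide names and expression values are invented placeholders with arbitrary units, and the function name `rank_by_expression` is hypothetical.

```python
def rank_by_expression(levels):
    """Return (name, level, rank) tuples ordered from highest to lowest level.

    Ties are broken alphabetically by name so the output is deterministic.
    """
    ordered = sorted(levels.items(), key=lambda item: (-item[1], item[0]))
    return [
        (name, level, rank)
        for rank, (name, level) in enumerate(ordered, start=1)
    ]


# Invented placeholder values in arbitrary units.
levels = {
    "polynucleotide_A": 12.5,
    "polynucleotide_B": 3.1,
    "polynucleotide_C": 48.0,
}

profile = rank_by_expression(levels)
for name, level, rank in profile:
    print(f"{rank}. {name}: {level}")
```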

[0431] As discussed above, the library of the invention also encompasses 
biochemical libraries of the polynucleotides shown in SEQ ID NOS.: 1-104 and 
2463 - 3697, e.g., collections of nucleic acids representing the provided 
polynucleotides. The biochemical libraries can take a variety of forms, e.g., a solution 
of cDNAs, a pattern of probe nucleic acids stably associated with a surface of a solid 
support (i.e., an array) and the like. Of particular interest are nucleic acid arrays in 
which one or more of the polynucleotide sequences shown in SEQ ID NOS.: 1 - 104 
is represented on the array. A variety of different array formats have been developed 
and are known to those of skill in the art. The arrays of the subject invention find use 
in a variety of applications, including gene expression analysis, drug screening, 
mutation analysis, and the like, as disclosed in the herein-listed exemplary patent 
documents. 

[0432] In addition to the above nucleic acid libraries, analogous libraries of 
polypeptides are also provided, where the polypeptides of the library will represent at 
least a portion of the polypeptides encoded by a gene corresponding to one or more of 
the sequences shown in SEQ ID NOS.: 1 - 104. 

[0433] Further, analogous libraries of antibodies are also provided, where the 
libraries comprise antibodies or fragments thereof that specifically bind to at least a 
portion of at least one of the subject polypeptides. Further, antibody libraries may 
comprise antibodies or fragments thereof that specifically inhibit binding of a subject 
polypeptide to its ligand or substrate, or that specifically inhibit binding of a subject 
polypeptide as a substrate to another molecule. Moreover, corresponding nucleic acid 
libraries are also provided, comprising polynucleotide sequences that encode the 
antibodies or antibody fragments described above. 
Polypeptides 

Sequences 

[0434] This invention provides novel polypeptides, and related polypeptide 
compositions. The novel polypeptides of the invention encompass proteins encoded 
by the nucleic acids having nucleotide sequences shown in SEQ ID NOS.: 1 - 104. 
The subject polypeptides are human polypeptides, fragments thereof, variants (such as 
splice variants), homologs from other species, and derivatives thereof. In particular 
embodiments, a polypeptide of the invention has an amino acid sequence substantially 
identical to the sequence of any polypeptide encoded by a polynucleotide sequence 
shown in SEQ ID NOS.: 1 - 104. 

[0435] These polypeptides may reside within the cell, or extracellularly. 
They may be secreted from the cell, reside in the cytoplasm, in the membranes, or in 
any of the intracellular organelles, including the nucleus, mitochondria, ribosomes, or 
storage granules. 

[0436] In many embodiments, a novel polypeptide of the invention functions 
as a secreted protein, a single-transmembrane protein, a multiple-transmembrane 
protein, a kinase, a protein kinase, a ligase, a nuclear hormone receptor, a 
phosphatase, a protease, a phosphodiesterase, a kinesin, an immunoglobulin, a T-cell 
receptor, or a glycosylphosphatidylinositol anchor. A novel polypeptide of the 
invention can also possess one or more of the following functions or properties: (1) 
an activator functioning to regulate one or more genes by increasing the rate of 
transcription, (2) an activator functioning to positively modulate an allosteric enzyme, 
(3) an adaptor functioning to sort cargo molecules into transport vesicles, (4) an 
adaptor functioning to form a clathrin-coated vesicle, (5) an adhesion molecule 
functioning to mediate the adhesion of cells with other cells and/or the extracellular 
matrix, (6) an ATPase functioning to move ions or small molecules across a 
membrane against a chemical concentration gradient or electrical potential, (7) an 
ATPase functioning to translocate nucleotides across membranes, (8) a 
breakpoint-related sequence functioning as an oncoprotein, (9) a breakpoint-related sequence 
functioning as a tumor-specific antigen, (10) a channel functioning as a water channel, 
(11) a channel functioning as an ion channel, (12) a checkpoint-related sequence 
functioning at DNA damage checkpoints, (13) a checkpoint-related sequence 
functioning at replication checkpoints, (14) a checkpoint-related sequence functioning 
to initiate signal transduction cascades eliciting cell cycle arrest, DNA repair, or 
apoptosis, (15) a complex functioning as a protein scaffold, (16) a complex 
functioning in ADP-ribosylation, (17) a dehydrogenase functioning to synthesize 
amino acids, (18) a disintegrin functioning to inhibit blood clotting, (19) a disintegrin 
functioning as a metallopeptidase, (20) a GTPase functioning as a negative regulator 
of p53, (21) a GTPase functioning to stimulate ras GTPase activity, (22) a helicase 
functioning in DNA replication, (23) a hydrolase functioning in propionate 
metabolism, (24) an integrase functioning to integrate a DNA copy of a retroviral 
genome into a host chromosome, (25) an integrin functioning as a tumor marker, (26) 
an integrin functioning in cell migration, (27) an isomerase functioning as an 
immunosuppressant, (28) a membrane protein functioning as a scaffolding component 
at the cytoplasmic face of a lipid raft, (29) a membrane protein functioning as a ligand 
for a receptor tyrosine kinase, (30) oxygenases and peroxidases functioning as 
antioxidants, (31) a phospholipase functioning in eicosanoid synthesis, (32) a 
phospholipase functioning in preserving the intestinal mucosa, (33) a prosaposin 
functioning in lipid catabolism, (34) a proteasome component functioning in muscle 
wasting, (35) a reductase-related sequence functioning as a coenzyme A reductase 
inhibitor, (36) a reverse transcriptase functioning as an RNA-dependent reverse 
transcriptase, (37) a reverse transcriptase functioning as a DNA-dependent reverse 
transcriptase, (38) an RNase functioning in viral assembly, (39) an RNase H 
functioning to form oligonucleotides that prime DNA synthesis, (40) an RNase H 
functioning to cleave the RNA strand of an RNA-DNA hybrid, (41) SH3 domains 
functioning in actin cytoskeletal organization, (42) SH3 domains functioning in signal 
transduction, (43) a synthetase functioning as an autoantigen, (44) synthetases 
functioning in nucleotide sugar phosphate synthesis, (45) TATA boxes functioning as 
transcription initiators, (46) tat functioning as a transcriptional coactivator, (47) 
transferases functioning in signal transduction, (48) transposases functioning as gene 
transfer agents, (49) ubiquitins functioning to protect cells against tumor necrosis 
factor induced cell death, (50) proteasome components and ubiquitin functioning in 
protein degradation, (51) a virus-related sequence functioning to confer resistance to 
infection by viruses, (52) other sequences of the invention interacting with one or 
more proteins, (53) other sequences of the invention enzymatically modifying one or 
more proteins, (54) other sequences of the invention binding one or more small 
molecule ligands, (55) other sequences of the invention binding one or more peptides, 
(56) other sequences of the invention binding one or more carbohydrates, and (57) 
other sequences of the invention functioning in vesicular transport. 

[0437] In some embodiments, the present novel polypeptide modulates the 
cells or tissues of animals, particularly humans, such as, for example, by stimulating, 
enhancing or inhibiting T or B cell function or the function of other hematopoietic 
cells or bone marrow cells; modulates adult or embryonic stem cell or precursor cell 
growth or differentiation; modulates cell function or activity of neuronal cells or other 
cells of the CNS, heart cells, liver cells, kidney cells, lung cells, pancreatic cells, 
gastrointestinal cells, spleen cells, breast cells, prostate cells, ovarian cells, and the 
like. 

[0438] In some embodiments, a subject polypeptide is present as a multimer. 
Multimers include homodimers, homotrimers, homotetramers, and multimers that 
include more than four monomeric units. Multimers also include heteromultimers, 
e.g., heterodimers, heterotrimers, heterotetramers, etc. where the subject polypeptide 
is present in a complex with proteins other than the subject polypeptide. Where the 
multimer is a heteromultimer, the subject polypeptide can be present in a 1:1 ratio, a 
1:2 ratio, a 2:1 ratio, or other ratio, with the other protein(s). 

[0439] In addition to the above specifically listed proteins, polypeptides 
from other species are also provided, including mammals, such as: primates, rodents, 
e.g., mice, rats, hamsters, guinea pigs; domestic animals, e.g., sheep, pig, horse, cow, 
goat, rabbit, dog, cat; and humans, as well as non-mammalian species, e.g., avian, 
reptile and amphibian, insect, crustacean, fish, plant, fungus, and protozoa. 

[0440] By "homolog" is meant a protein having at least about 35%, at least 
about 40%, at least about 60%, at least about 70%, at least about 75%, at least about 
80%, at least about 85%, at least about 90%, or at least about 95%, or higher, amino 
acid sequence identity to the reference polypeptide, as measured with the "GAP" 
program (part of the Wisconsin Sequence Analysis Package available through the 
Genetics Computer Group, Inc. (Madison WI)), where the parameters are: Gap 
weight: 12; length weight: 4. In many embodiments of interest, homology will be at 
least about 75%, at least about 80%, or at least 85%, where in certain embodiments of 
interest, homology will be as high as about 90%. 
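
As a rough illustration of how an identity threshold of the kind recited above might be checked, the following Python sketch computes percent identity from two sequences that have already been globally aligned (for example, by a program such as GAP). The function names, the toy sequences, and the choice of denominator (all alignment columns except those gapped in both sequences) are assumptions made for illustration; the reporting conventions of any particular alignment program may differ.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    Assumption: identity = identical residue pairs / alignment columns,
    skipping columns where both sequences carry a gap character.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    columns = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap and b == gap:
            continue  # ignore columns that are gaps in both sequences
        columns += 1
        if a == b and a != gap:
            matches += 1
    if columns == 0:
        raise ValueError("no comparable alignment columns")
    return 100.0 * matches / columns


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float) -> bool:
    """True if the aligned pair reaches the given percent-identity threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    # Hypothetical aligned peptides; one gap and one mismatch in six columns.
    print(round(percent_identity("MKT-AY", "MKTLAF"), 1))
    print(meets_threshold("MKTAY", "MKTAF", 75.0))
```

Under these assumptions, a candidate would be treated as meeting a given threshold (for example 75%) when `meets_threshold` returns true for its alignment to the reference polypeptide.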

[0441] Also provided are polypeptides that are substantially identical to 
at least one amino acid sequence shown in the Sequence Listing, or a fragment 
thereof, where by "substantially identical" is meant that the protein has an amino acid 
sequence identity to the reference sequence of at least about 75%, at least about 80%, 
at least about 85%, at least about 90%, at least about 95%, at least about 97%, at least 
about 98%, or at least about 99%. 

[0442] The proteins of the subject invention (e.g., polypeptides encoded by 
the nucleotide sequences shown in SEQ ID NOS.: 1-104) have been separated from 
their naturally occurring environment and are present in a non-naturally occurring 
environment. In certain embodiments, the proteins are present in a composition 
where they are more concentrated than in their naturally occurring environment. For 
example, purified polypeptides are provided. 

[0443] In addition to naturally occurring proteins, polypeptides that vary 
from naturally occurring forms are also provided. Fusion proteins can comprise a 
subject polypeptide, or fragment thereof, and a polypeptide other than a subject 
polypeptide ("the fusion partner") fused in-frame at the N-terminus and/or C-terminus 
of the subject polypeptide, or internally to the subject polypeptide. 

[0444] Suitable fusion partners include, but are not limited to, 
immunologically detectable proteins (e.g., epitope tags, such as hemagglutinin, 
FLAG, and c-myc); polypeptides that provide a detectable signal or that serve as 
detectable markers (e.g., a fluorescent protein, e.g., a green fluorescent protein, a 
fluorescent protein from an Anthozoan species; β-galactosidase; luciferase; Cre 
recombinase; and the like); polypeptides that provide a catalytic function or induce a 
cellular response; polypeptides that provide for secretion of the fusion protein from a 
eukaryotic cell; polypeptides that provide for secretion of the fusion protein from a 
prokaryotic cell; polypeptides that provide for binding to metal ions (e.g., Hisn, where 
n = 3-10, e.g., 6His) and structural proteins. Fusion partners can also be those that are 
able to stabilize the present polypeptide, such as polyethylene glycol ("PEG") and a 
fragment of an immunoglobulin, such as the Fc fragment of IgG, IgE, IgA, IgM, 
and/or IgD. 

[0445] Detection methods are chosen based on the detectable fusion partner. 
For example, where the fusion partner provides an immunologically recognizable 
epitope, an epitope-specific antibody can be used to quantitatively detect the level of 
polypeptide. In some embodiments, the fusion partner provides a detectable signal, 
and in these embodiments, the detection method is chosen based on the type of signal 
generated by the fusion partner. For example, where the fusion partner is a 
fluorescent protein, fluorescence is measured. 

[0446] Where the fusion partner is an enzyme that yields a detectable 
product, the product can be detected using an appropriate means. For example, 
β-galactosidase can, depending on the substrate, yield a colored product that can be 
detected with a spectrophotometer, and the enzyme luciferase can yield a 
luminescent product detectable with a luminometer. 

[0447] In some embodiments, a polypeptide of the invention comprises at 
least about 5, at least about 8, at least about 10, at least about 15, at least about 18, at 
least about 20, at least about 25, at least about 30, at least about 50, at least about 75, 
at least about 100, at least about 150, at least about 200, at least about 250, at least 
about 300, at least about 350, at least about 400, at least about 450, at least about 500, 
at least about 550, at least about 600, at least about 650, at least about 700, at least 
about 750, at least about 800, at least about 850, at least about 900, at least about 950, 
or at least about 1000 contiguous amino acid residues of at least one of the sequences 
according to SEQ ID NOS.: 1 - 104, up to and including the entire amino acid 
sequence. 

[0448] Fragments of the subject polypeptides, as well as polypeptides 
comprising such fragments, are also provided. Fragments of polypeptides of interest 
will typically be at least about 5, at least about 8, at least about 10, at least about 15, at 
least about 18, at least about 20, at least about 25, at least about 30, at least about 50, 
at least about 75, at least about 100, at least about 150, at least about 200, at least 
about 250, or at least 300 aa in length or longer, where the fragment will have a 
stretch of amino acids that is identical to the subject protein of at least about 5, at least 
about 8, at least about 10, at least about 15, at least about 18, at least about 20, at least 
about 25, at least about 30, or at least about 50 aa in length. 
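
The requirement that a fragment share a contiguous, identical stretch of a given minimum length with the subject protein can be sketched as a longest-common-substring check. The Python below is a minimal sketch only; the function names and the toy sequences are hypothetical and are not taken from the Sequence Listing.

```python
def longest_shared_stretch(fragment: str, reference: str) -> int:
    """Length of the longest contiguous run of residues found in both sequences."""
    best = 0
    # prev[j] holds the length of the common run ending at
    # fragment[i - 2] and reference[j - 1] from the previous row.
    prev = [0] * (len(reference) + 1)
    for i in range(1, len(fragment) + 1):
        curr = [0] * (len(reference) + 1)
        for j in range(1, len(reference) + 1):
            if fragment[i - 1] == reference[j - 1]:
                curr[j] = prev[j - 1] + 1
                best = max(best, curr[j])
        prev = curr
    return best


def has_identical_stretch(fragment: str, reference: str, min_length: int) -> bool:
    """True if the fragment shares an identical stretch of at least min_length residues."""
    return longest_shared_stretch(fragment, reference) >= min_length


if __name__ == "__main__":
    # Hypothetical sequences sharing the eight-residue stretch "MKTAYIAK".
    print(longest_shared_stretch("XXMKTAYIAKXX", "GGMKTAYIAKQRQ"))
    print(has_identical_stretch("XXMKTAYIAKXX", "GGMKTAYIAKQRQ", 5))
```

The dynamic-programming table is quadratic in sequence length, which is adequate for peptides of the lengths discussed here; a suffix-based index would be a reasonable alternative for much longer inputs.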

[0449] In some embodiments, fragments exhibit one or more activities 
associated with a corresponding naturally occurring polypeptide. Fragments find 
utility in generating antibodies to the full-length polypeptide; and in methods of 
screening for candidate agents that bind to and/or modulate polypeptide activity. 
Specific fragments of interest include those with enzymatic activity, those with 
biological activity including the ability to serve as an epitope or immunogen, and 
fragments that bind to other proteins or to nucleic acids. 

[0450] The invention provides polypeptides comprising such fragments, 
including, e.g., fusion polypeptides comprising a subject polypeptide fragment fused 
in frame (directly or indirectly) to another protein (the "fusion partner"), such as the 
signal peptide of one protein being fused to the mature polypeptide of another protein. 
Such fusion proteins are typically made by linking the encoding polynucleotides 
together in a vector or cassette. Suitable fusion partners include, but are not limited 
to, immunologically detectable proteins (e.g., epitope tags, such as hemagglutinin, 
FLAG, and c-myc); polypeptides that provide a detectable signal or that serve as 
detectable markers (e.g., a fluorescent protein, e.g., a green fluorescent protein, a 
fluorescent protein from an Anthozoan species; β-galactosidase; luciferase; Cre 
recombinase); polypeptides that provide a catalytic function or induce a cellular 
response; polypeptides that provide for secretion of the fusion protein from a 
eukaryotic cell; polypeptides that provide for secretion of the fusion protein from a 
prokaryotic cell; polypeptides that provide for binding to metal ions (e.g., Hisn, where 
n = 3-10, e.g., 6His) and structural proteins. Fusion partners can also be those that are 
able to stabilize the present polypeptide, such as polyethylene glycol ("PEG") and a 
fragment of an immunoglobulin, such as the Fc fragment of IgG, IgE, IgA, IgM, 
and/or IgD. 

Polypeptide Preparation. 

[0451] Polypeptides of the invention can be obtained from naturally 
occurring sources or produced synthetically. The sources of naturally occurring 
polypeptides will generally depend on the species from which the protein is to be 
derived, i.e., the proteins will be derived from biological sources that express the 
proteins. The subject proteins can also be derived from synthetic means, e.g., by 
expressing a recombinant gene encoding a protein of interest in a suitable system or 
host or enhancing endogenous expression, as described in more detail above. Further, 
small peptides can be synthesized in the laboratory by techniques well known in the 
art. 

[0452] In all cases, the product can be recovered by any appropriate means 
known in the art. For example, convenient protein purification procedures can be 
employed (e.g., see Guide to Protein Purification, Deutscher et al., 1990). That is, a 
lysate can be prepared from the original source (e.g., a cell expressing endogenous 
polypeptide, or a cell comprising the expression vector expressing the polypeptide(s)), 
and purified using HPLC, exclusion chromatography, gel electrophoresis, affinity 
chromatography, and the like. 

[0453] The invention thus also provides methods of producing polypeptides. 
Briefly, the methods generally involve introducing a nucleic acid construct into a host 
cell in vitro and culturing the host cell under conditions suitable for expression, then 
harvesting the polypeptide, either from the culture medium or from the host cell, (e.g., 
by disrupting the host cell), or both, as described in detail above. The invention also 
provides methods of producing a polypeptide using cell-free in vitro 
transcription/translation methods, which are well known in the art, also as provided 
above. 

[0454] Moreover, the invention provides polypeptides, including polypeptide 
fragments, as targets for therapeutic intervention, including use in screening assays 
for identifying agents that modulate polypeptide level and/or activity, and as targets 
for antibody and small molecule therapeutics, for example, in the treatment of 
disorders. 

Methods 

[0455] The present invention provides methods of producing a subject 
polypeptide and provides antibodies that specifically bind to a subject polypeptide. 
The present invention further provides screening methods for identifying agents that 
modulate a level or an activity of a subject polypeptide or polynucleotide. The 
present invention thus also provides agents that modulate a level or an activity of a 
subject polypeptide or polynucleotide, as well as compositions, including 
pharmaceutical compositions, comprising a subject agent. 

[0456] The present invention further provides methods for treating disorders 
such as, for example, cancer and other proliferative disorders or conditions, 
inflammatory and immune disorders, metabolic disorders or conditions and bacterial 
or viral disorders or conditions. 

Diagnostic and Therapeutic Applications 

Screening and Diagnostic Methods 

1. Identifying Biological Molecules that Interact with a Polypeptide 
[0457] Formation of a binding complex between a subject polypeptide and 
an interacting polypeptide or other macromolecule (e.g., DNA, RNA, lipids, 
polysaccharides, and the like) can be detected using any known method. Suitable 
methods include: a yeast two-hybrid system (Zhu et al., 1997; Fields and Song, 1989; 
U.S. Pat. No. 5,283,173; Chien et al. 1991); a mammalian cell two-hybrid method; a 
fluorescence resonance energy transfer (FRET) assay; a bioluminescence resonance 
energy transfer (BRET) assay; a fluorescence quenching assay; a fluorescence 
anisotropy assay (Jameson and Sawyer, 1995); an immunological assay; and an assay 
involving binding of a detectably labeled protein to an immobilized protein. 

[0458] Immunological assays, and assays involving binding of a detectably 
labeled protein to an immobilized protein can be performed in a variety of ways. For 
example, immunoprecipitation assays can be designed such that the complex of 
protein and an interacting polypeptide is detected by precipitation with an antibody 
specific for either the protein or the interacting polypeptide. 

[0459] FRET detects formation of a binding complex between a subject 
polypeptide and an interacting polypeptide. It involves the transfer of energy from a 
donor fluorophore in an excited state to a nearby acceptor fluorophore. For this 
transfer to take place, the donor and acceptor molecules must be in close proximity 
(e.g., less than 10 nanometers apart, usually between 10 and 100 Å apart), and the 
emission spectra of the donor fluorophore must overlap the excitation spectra of the 
acceptor fluorophore. In these embodiments, a fluorescently labeled subject protein 
serves as a donor and/or acceptor in combination with a second fluorescent protein or 
dye. 
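
The steep distance dependence that restricts FRET to donor-acceptor pairs in close proximity follows from the standard Förster relation, E = 1 / (1 + (r/R0)^6), where r is the donor-acceptor distance and R0 is the Förster radius at which transfer is 50% efficient. The Python sketch below evaluates this relation; the function name and the example Förster radius of 5 nm are illustrative assumptions and do not describe any particular fluorophore pair of the invention.

```python
def fret_efficiency(distance_nm: float, forster_radius_nm: float) -> float:
    """Energy transfer efficiency from the Forster relation E = 1 / (1 + (r/R0)**6)."""
    if distance_nm < 0:
        raise ValueError("distance must be non-negative")
    if forster_radius_nm <= 0:
        raise ValueError("Forster radius must be positive")
    return 1.0 / (1.0 + (distance_nm / forster_radius_nm) ** 6)


if __name__ == "__main__":
    r0 = 5.0  # assumed, illustrative Forster radius in nanometers
    for r in (2.5, 5.0, 7.5, 10.0):
        print(f"r = {r:4.1f} nm  E = {fret_efficiency(r, r0):.3f}")
```

With the assumed 5 nm radius, efficiency is one half at 5 nm and falls to under 2% at 10 nm, consistent with the proximity requirement stated above.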

[0460] Fluorescent proteins can be produced by generating a construct 
comprising a protein and a fluorescent fusion partner. These are well-known in the 
art, as described above, including green fluorescent protein (GFP), i.e., a "humanized" 
version of a GFP, e.g., wherein codons of the naturally-occurring nucleotide sequence 
are changed to more closely match human codon bias; a GFP derived from Aequorea 
victoria or a derivative thereof, e.g., a "humanized" derivative such as Enhanced GFP, 
which are available commercially, e.g., from Clontech, Inc.; other fluorescent mutants 
of a GFP from Aequorea victoria, e.g., as described in U.S. Patent No. 6,066,476; 
6,020,192; 5,985,577; 5,976,796; 5,968,750; 5,968,738; 5,958,713; 5,919,445; 
5,874,304; a GFP from another species such as Renilla reniformis, Renilla mulleri, or 
Ptilosarcus guernyi, as previously described (WO 99/49019; Peelle et al., 2001), 
"humanized" recombinant GFP (hrGFP) (Stratagene®); any of a variety of fluorescent 
and colored proteins from Anthozoan species, (e.g., Matz et al., 1999); as well as 
proteins labeled with other fluorescent dyes, fluorescein and its derivatives, e.g., 
fluorescein isothiocyanate (FITC), 6-carboxyfluorescein (6-FAM), 6-carboxy- 
2',4',7',4,7-hexachlorofluorescein (HEX), 5-carboxyfluorescein (5-FAM), 
2',7'-dimethoxy-4',5'-dichloro-6-carboxyfluorescein (JOE); rhodamine dyes, e.g., 
Texas red, phycoerythrin, tetramethylrhodamine, rhodamine, 6-carboxy-X-rhodamine 
(ROX); coumarin and its derivatives, e.g., 7-amino-4-methylcoumarin, 
aminocoumarin; bodipy dyes, such as Bodipy FL; cascade blue; Oregon green; eosins 
and erythrosins; cyanine dyes, e.g., allophycocyanin, Cy3, Cy5, and 
N,N,N',N'-tetramethyl-6-carboxyrhodamine (TAMRA); macrocyclic chelates of lanthanide ions, 
e.g., quantum dye, etc.; and chemiluminescent molecules, e.g., luciferases. 

[0461] Fluorescent subject proteins can also be generated by producing the 
subject protein in an auxotrophic strain of bacteria which requires addition of one or 
more amino acids in the medium for growth. A subject protein-encoding construct 
that provides for expression in bacterial cells is introduced into the auxotrophic strain, 
and the bacteria are cultured in the presence of a fluorescent amino acid, which is 
incorporated into the subject protein produced by the bacterium. The subject protein 
is then purified from the bacterial culture using standard methods for protein 
purification. 

[0462] BRET is a protein-protein interaction assay based on energy transfer 
from a bioluminescent donor to a fluorescent acceptor protein. The BRET signal is 
measured by the ratio of the amount of light emitted by the acceptor to the amount of 
light emitted by the donor. The ratio of these two values increases as the two proteins 
are brought into proximity. The BRET assay has been described in the literature 
(U.S. Patent Nos. 6,020,192; 5,968,750; 5,874,304; Xu, et al. 1999). BRET assays 
can be performed by analyzing transfer between a bioluminescent donor protein and a 
fluorescent acceptor protein. Interaction between the donor and acceptor proteins can 
be monitored by a change in the ratio of light emitted by the bioluminescent and 
fluorescent proteins. In this application, the subject protein serves as donor and/or 
acceptor protein. 
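
The BRET readout described above, i.e., light emitted by the acceptor divided by light emitted by the donor, can be sketched in a few lines of Python. The function names, the photon-count values, and the idea of comparing a test sample with a donor-only control are illustrative assumptions rather than a prescribed protocol.

```python
def bret_ratio(acceptor_emission: float, donor_emission: float) -> float:
    """BRET signal: acceptor light emission divided by donor light emission."""
    if donor_emission <= 0:
        raise ValueError("donor emission must be positive")
    if acceptor_emission < 0:
        raise ValueError("acceptor emission must be non-negative")
    return acceptor_emission / donor_emission


def ratio_increased(sample_ratio: float, control_ratio: float) -> bool:
    """True if the sample ratio exceeds the control ratio.

    An increase suggests that donor and acceptor were brought into proximity.
    """
    return sample_ratio > control_ratio


if __name__ == "__main__":
    # Hypothetical luminometer counts (arbitrary units).
    control = bret_ratio(acceptor_emission=100.0, donor_emission=1000.0)
    sample = bret_ratio(acceptor_emission=300.0, donor_emission=1000.0)
    print(control, sample, ratio_increased(sample, control))
```

In practice a significance criterion over replicate wells would replace the simple greater-than comparison used here.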

[0463] Fluorescence anisotropy is a measurement of the rotational mobility 
of a multi-molecular complex. It can be used to generate information about the 
binding of one molecule to another, including the affinity and specificity of binding 
sites. It can be applied to polypeptides or nucleic acids of the present invention. 
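
Steady-state anisotropy is conventionally computed from the emission intensities measured parallel and perpendicular to the polarized excitation, r = (I_par - G * I_perp) / (I_par + 2 * G * I_perp), where G is an instrument correction factor. The Python sketch below applies this standard formula; the function name, the default G of 1.0, and the example intensities are assumptions for illustration.

```python
def anisotropy(i_parallel: float, i_perpendicular: float, g_factor: float = 1.0) -> float:
    """Fluorescence anisotropy r = (I_par - G*I_perp) / (I_par + 2*G*I_perp)."""
    denominator = i_parallel + 2.0 * g_factor * i_perpendicular
    if denominator <= 0:
        raise ValueError("total intensity must be positive")
    return (i_parallel - g_factor * i_perpendicular) / denominator


if __name__ == "__main__":
    # Hypothetical intensities: a free labeled peptide tumbles rapidly (low r),
    # while the same peptide bound in a larger complex tumbles slowly (higher r).
    free = anisotropy(i_parallel=1.2, i_perpendicular=1.0)
    bound = anisotropy(i_parallel=3.0, i_perpendicular=1.0)
    print(f"free r = {free:.3f}, bound r = {bound:.3f}")
```

Because a bound species rotates more slowly, its anisotropy rises, so titrating a binding partner against a fixed amount of labeled molecule and tracking r gives one route to the affinity information mentioned above.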

[0464] Fluorescence quenching measurements are useful in detecting protein 
multimerization, such as where the subject protein interacts with at least a second 
protein and, for example, where multimerization interaction is affected by a test agent. 
As used herein, the term "multimerization" refers to formation of dimers, trimers, 
tetramers, and higher multimers of the subject protein. Whether a subject protein 
forms a complex with one or more additional protein molecules can be determined 
using any known assay, including assays as described above for interacting proteins. 
Formation of multimers can also be detected using non-denaturing gel 
electrophoresis, where multimerized subject protein migrates more slowly than 
monomeric subject protein. Formation of multimers can also be detected using 
fluorescence quenching techniques. 

[0465] Formation of multimers can also be detected by analytical 
ultracentrifugation, for example through glycerol or sucrose gradients, and subsequent 
visualization of a subject protein in gradient fractions by Western blotting or staining 
of SDS-polyacrylamide gels. Multimers are expected to sediment at defined positions 
in such gradients. Formation of multimers can also be detected using analytical gel 
filtration, e.g., in HPLC or FPLC systems, e.g., on columns such as Superdex 200 
(Pharmacia Amersham Inc.). Multimers run at defined positions on these columns, 
and fractions can be analyzed as above. The columns are highly 
reproducible, allowing one to relate the number and position of peaks directly to the 
multimerization status of the protein. 

2. Detecting mRNA Levels and Monitoring Gene Expression 
[0466] The present invention provides methods for detecting the presence of 
mRNA in a biological sample. The methods can be used, for example, to assess 
whether a test compound affects gene expression, either directly or indirectly. The 
present invention provides diagnostic methods to compare the abundance of a nucleic 
acid with that of a control value, either qualitatively or quantitatively, and to relate the 
value to a normal or abnormal expression pattern. 

[0467] Methods of measuring mRNA levels are known in the art (Pietu, 
1996; Zhao, 1995; Soares, 1997; Raval, 1994; Chalifour, 1994; Stolz, 1996; Hong, 
1982; McGraw, 1984; WO 97/27317). These methods generally comprise contacting 
a sample with a polynucleotide of the invention under conditions that allow 
hybridization and detecting hybridization, if any, as an indication of the presence of 
the polynucleotide of interest. Appropriate controls include the use of a sample 
lacking the polynucleotide mRNA of interest, or the use of a labeled polynucleotide of 
the same "sense" as a polynucleotide mRNA of interest. Detection can be 
accomplished by any known method, including, but not limited to, in situ 
hybridization, PCR, RT-PCR, and "Northern" or RNA blotting, or combinations of 
such techniques, using a suitably labeled subject polynucleotide. A variety of labels 
and labeling methods for polynucleotides are known in the art and can be used in the 
assay methods of the invention. A common method employed is use of microarrays 
which can be purchased or customized, for example, through conventional vendors 
such as Affymetrix. 

[0468] In some embodiments, the methods involve generating a cDNA copy 
of an mRNA molecule in a biological sample, and amplifying the cDNA using an 
isolated primer pair as described above, i.e., a set of two nucleic acid molecules that 
serve as forward and reverse primers in an amplification reaction (e.g., a polymerase 
chain reaction). The primer pairs are chosen to specifically amplify a cDNA copy of 
an mRNA encoding a polypeptide. A detectable label can be included in the 
amplification reaction, as provided above. Methods using PCR amplification can be 
performed on the DNA from a single cell, although it is convenient to use at least 
about 10^ cells. 
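
How a forward and reverse primer pair delimits the amplified region can be sketched in silico. The Python below is a simplified sketch assuming exact primer matches on a single linear sense-strand template; the function names and the toy template and primer sequences are hypothetical and do not correspond to any sequence of the invention.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def amplicon(template: str, forward: str, reverse: str):
    """Return the predicted amplicon (sense strand), or None if a primer does not match.

    The forward primer must match the template exactly; the reverse primer
    anneals to the sense strand, so its reverse complement is searched for
    downstream of the forward primer.
    """
    sense = template.upper()
    start = sense.find(forward.upper())
    if start == -1:
        return None
    reverse_site = reverse_complement(reverse)
    end = sense.find(reverse_site, start + len(forward))
    if end == -1:
        return None
    return sense[start:end + len(reverse_site)]


if __name__ == "__main__":
    cdna = "GGATGCCATTAGCCGTAACGTTAGCAA"  # hypothetical cDNA fragment
    product = amplicon(cdna, forward="ATGCCA", reverse="GCTAAC")
    print(product, len(product))
```

Real primer design would additionally weigh melting temperature, mismatch tolerance, and specificity against other transcripts, none of which is modeled in this sketch.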

[0469] The present invention provides methods for monitoring gene 
expression. Changes in a promoter or enhancer sequence that can affect gene 
expression can be examined in light of expression levels of the normal allele by 
various methods known in the art. Methods for determining promoter or enhancer 
strength include quantifying the expressed natural protein, and inserting the variant 
control element into a vector with a quantitative reporter gene such as β-galactosidase, 
luciferase, or chloramphenicol acetyltransferase (CAT). 

3. Detecting Polymorphisms and Mutations 
[0470] Biochemical studies can determine whether a sequence 
polymorphism in a coding region or control region is associated with disease. 
Disease-associated polymorphisms can include deletion or truncation of the gene, 
mutations that alter expression level, or mutations that affect protein function, etc. A 
number of methods are available to analyze nucleic acids for the presence of a 
specific sequence, e.g., a disease associated polymorphism. Genomic DNA can be 
used when large amounts of DNA are available. Alternatively, the region of interest 
is cloned into a suitable vector and grown in sufficient quantity for analysis. Cells 
that express the gene provide a source of mRNA, which can be assayed directly or 
reverse transcribed into cDNA for analysis. The nucleic acid can be amplified by 
conventional techniques, i.e., PCR, to provide sufficient amounts for analysis (Saiki 
et al., 1988; Sambrook et al., 1989, pp. 14.2-14.33). Alternatively, various methods 
are known in the art that utilize oligonucleotide ligation as a means of detecting 
polymorphisms (Riley et al., 1990; Delahunty et al., 1996). 

[0471] The sample nucleic acid, e.g., an amplified or cloned fragment, is 
analyzed by one of a number of methods known in the art. The nucleic acid can be 
sequenced by dideoxy nucleotide sequencing, or other methods, and the sequence of 
bases compared to a wild-type sequence. Hybridization with the variant sequence can 
also be used to determine its presence, e.g., by Southern blots, dot blots, etc. The 
hybridization pattern of a control and variant sequence to an array of oligonucleotide 
probes immobilized on a solid support, as described in US Pat. No. 5,445,934, or 
WO 95/35505, can also be used as a means of detecting the presence of variant 
sequences. Single strand conformational polymorphism (SSCP) analysis, denaturing 
gradient gel electrophoresis (DGGE), and heteroduplex analysis in gel matrices can 
detect variation as alterations in electrophoretic mobility resulting from 
conformational changes created by DNA sequence alterations. Alternatively, where a 
polymorphism creates or destroys a recognition site for a restriction endonuclease, the 
sample can be digested with that endonuclease, and the products fractionated 
according to their size to determine whether the fragment was digested. Fractionation 
can be performed by gel or capillary electrophoresis, for example with acrylamide or 
agarose gels. 
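
The restriction-site approach described above can be illustrated with a small in silico digest: a polymorphism that destroys a recognition site changes the set of fragment sizes that would be observed after fractionation. The Python sketch below assumes a linear molecule, exact site matches on one strand, and complete digestion; the function name and the toy sequences are hypothetical. EcoRI, which recognizes GAATTC and cuts after the first G, is used only as a familiar example enzyme.

```python
def digest(sequence: str, site: str, cut_offset: int) -> list:
    """Fragment lengths after cutting a linear sequence at every occurrence of a site.

    cut_offset is the number of bases from the start of the recognition site
    to the cut position on the strand given (1 for EcoRI, G^AATTC).
    """
    seq = sequence.upper()
    site = site.upper()
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)
    bounds = [0] + cuts + [len(seq)]
    # Drop zero-length pieces that would arise from a cut at either end.
    return [b - a for a, b in zip(bounds, bounds[1:]) if b > a]


if __name__ == "__main__":
    wild_type = "AAAAGAATTCTTTT"  # hypothetical allele carrying an EcoRI site
    variant = "AAAAGAGTTCTTTT"    # hypothetical allele with the site destroyed
    print(digest(wild_type, "GAATTC", 1))
    print(digest(variant, "GAATTC", 1))
```

Comparing the two fragment-size lists mirrors what would be read from a gel: the allele retaining the site yields two bands, while the allele lacking it migrates as a single uncut fragment.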

[0472] Screening for mutations in a gene can be based on the functional or 
antigenic characteristics of the protein. Protein truncation assays are useful in 
detecting deletions that might affect the biological activity of the protein. Various 
immunoassays designed to detect polymorphisms in proteins can be used in screening. 
Where many diverse genetic mutations lead to a particular disease phenotype, 
functional protein assays have proven to be effective screening tools. The activity of 
the encoded protein can be determined by comparison with the wild-type protein. 

4. Detecting and Monitoring Polypeptide Presence and Biological Activity 
[0473] The present invention provides methods for detecting the presence 
and/or biological activity of a subject polypeptide in a biological sample. The assay 
used will be appropriate to the biological activity of the particular polypeptide. Thus, 
e.g., where the biological activity is an enzymatic activity, the method will involve 
contacting the sample with an appropriate substrate, and detecting the product of the 
enzymatic reaction on the substrate. Where the biological activity is binding to a 
second macromolecule, the assay detects protein-protein binding, protein-DNA 
binding, protein-carbohydrate binding, or protein-lipid binding, as appropriate, using 
well known assays. Where the biological activity is signal transduction (e.g., 
transmission of a signal from outside the cell to inside the cell) or transport, an 
appropriate assay is used, such as measurement of intracellular calcium ion 
concentration, measurement of membrane conductance changes, or measurement of 
intracellular potassium ion concentration. 

[0474] The present invention also provides methods for detecting the 
presence or measuring the level of a normal or abnormal polypeptide in a biological 
sample using a specific antibody. The methods generally comprise contacting the 
sample with a specific antibody and detecting binding between the antibody and 
molecules of the sample. Specific antibody binding, when compared to a suitable 
control, is an indication that a polypeptide of interest is present in the sample. 
Suitable controls include a sample known not to contain the polypeptide, and a sample 
contacted with a non-specific antibody, e.g., an anti-idiotype antibody. 

[0475] A variety of methods to detect specific antibody-antigen interactions 
are known in the art, e.g., standard immunohistological methods, 
immunoprecipitation, enzyme immunoassay, and radioimmunoassay. The specific 
antibody can be detectably labeled, either directly or indirectly, as described at length 
herein, and cells are permeabilized to stain cytoplasmic molecules. Briefly, 
antibodies are added to a cell sample, and incubated for a period of time sufficient to 
allow binding to the epitope, usually at least about 10 minutes. The antibody may be 
labeled with radioisotopes, enzymes, fluorescers, chemiluminescers, or other labels 
for direct detection. Alternatively, specific-binding pairs may be used, involving, 
e.g., a second stage antibody or reagent that is detectably-labeled, as described above. 
Such reagents and their methods of use are well known in the art. 

[0476] Alternatively, a biological sample can be brought into contact with an 
immobilized antibody on a solid support or carrier, such as nitrocellulose, that is 
capable of immobilizing cells, cell particles, or soluble proteins. The antibody can be 
attached (coupled) to an insoluble support, such as a polystyrene plate or a bead. 
After contacting the sample, the support can then be washed with suitable buffers, 
followed by contacting with a detectably-labeled specific antibody. Detection 
methods are known in the art and will be chosen as appropriate to the signal emitted 
by the detectable label. Detection is generally accomplished in comparison to suitable 
controls, and to appropriate standards. 

[0477] The present invention further provides methods for detecting the 
presence and/or levels of enzymatic activity of a subject polypeptide in a biological 
sample. The methods generally involve contacting the sample with a substrate that 
yields a detectable product upon being acted upon by a subject polypeptide, and 
detecting a product of the enzymatic reaction. Further, polypeptides that are subsets 
of the complete sequences of the subject proteins may be used to identify and 
investigate parts of the protein important for function. 

[0478] The present invention further includes methods for monitoring 
activity of a polypeptide through observation of phenotypic changes in a cell 
containing such polypeptide, such as growth or differentiation, or the ability of such a 
cell to secrete a molecule that can be detected, such as through chemical methods or 
through its effect on another cell, such as cell activation. 

5. Modulating mRNA and Peptides in Biological Samples 
[0479] The present invention provides screening methods for identifying 
agents that modulate the level of a mRNA molecule of the invention, agents that 
modulate the level of a polypeptide of the invention, and agents that modulate the 
biological activity of a polypeptide of the invention. In some embodiments, the assay 
is cell-free; in others, it is cell-based. Where the screening assay is a binding assay, 
one or more of the molecules can be joined to a label, where the label can directly or 
indirectly provide a detectable signal. 

[0480] As discussed above, the invention encompasses endogenous 
polynucleotides of the invention that encode mRNA and/or polypeptides of interest. 
Again as discussed previously, the invention also encompasses exogenous 
polynucleotides that encode mRNA or polypeptides of the invention. For example, 
the polynucleotide can reside within a recombinant vector which is introduced into the 
cell. For example, a recombinant vector can comprise an isolated transcriptional 
regulatory sequence which is associated in nature with a nucleic acid, such as a 
promoter sequence operably linked to sequences coding for a polypeptide of the 
invention; or the transcriptional control sequences can be operably linked to coding 
sequences for a polypeptide fusion protein comprising a polypeptide of the invention 
fused to a polypeptide that facilitates detection. 

[0481] In these embodiments, the candidate agent is combined with a cell 
possessing a polynucleotide transcriptional regulatory element operably linked to a 
polypeptide-coding sequence of interest, e.g., a subject cDNA or its genomic 
component, and the agent's effect on polynucleotide expression is determined, as 
measured, for example, by the level of mRNA, polypeptide, or fusion polypeptide. 

[0482] In other embodiments, for example, a recombinant vector can 
comprise an isolated polynucleotide transcriptional regulatory sequence, such as a 
promoter sequence, operably linked to a reporter gene (e.g., β-galactosidase, CAT, 
luciferase, or other gene that can be easily assayed for expression). In these 
embodiments, the method for identifying an agent that modulates a level of 
expression of a polynucleotide in a cell comprises combining a candidate agent with a 
cell comprising a transcriptional regulatory element operably linked to a reporter 
gene; and determining the effect of said agent on reporter gene expression. 

[0483] Known methods of measuring mRNA levels can be used to identify 
agents that modulate mRNA levels, including, but not limited to, PCR with 
detectably-labeled primers. Similarly, agents that modulate polypeptide levels can be 
identified using standard methods for determining polypeptide levels, including, but 
not limited to an immunoassay such as ELISA with detectably-labeled antibodies. 

[0484] A wide variety of cell-based assays can also be used to identify 
agents that modulate eukaryotic or prokaryotic mRNA and/or polypeptide levels. 
Examples include transformed cells that over-express a cDNA construct and cells 
transformed with a polynucleotide of interest associated with an endogenously- 
associated promoter operably linked to a reporter gene. A control sample would 
comprise, for example, the same cell lacking the candidate agent. Expression levels 
are measured and compared in the test and control samples. 

[0485] The cells used in the assay are usually mammalian cells, including, 
but not limited to, rodent cells and human cells. The cells can be primary cell cultures 
or can be immortalized cell lines. Cell-based assays generally comprise the steps of 
contacting the cell with a test agent, forming a test sample, and, after a suitable time, 
assessing the agent's effect on macromolecule expression. That is, the mammalian 
cell line is transformed or transfected with a construct that results in expression of the 
polynucleotide, the cell is contacted with a test agent, and then mRNA or polypeptide 
levels are detected and measured using conventional assays. 

[0486] A suitable period of time for contacting the agent with the cell can be 
determined empirically, and is generally a time sufficient to allow entry of the agent 
into the cell and to allow the agent to have a measurable effect on subject mRNA 
and/or polypeptide levels. Generally, a suitable time is between about 10 minutes and 
about 24 hours, including about 1 to about 8 hours. Alternatively, incubation periods 
may be between about 0.1 and about 1 hour, selected for example for optimum 
activity or to facilitate rapid high-throughput screening. Where the polypeptide is 
expressed on the cell surface, however, a shorter length of time may be sufficient. 
Incubations are performed at any suitable temperature, i.e., between about 4°C and 
about 40°C. The contact and incubation steps can be followed by a washing step to 
remove unbound components, i.e., a label that would give rise to a background signal 
during subsequent detection of specifically-bound complexes. 

[0487] A variety of assay configurations and protocols are known in the art. 
For example, one of the components can be bound to a solid support, and the 
remaining components contacted with the support-bound component. Remaining 
components may be added at different times or at substantially the same time. 
Further, where the interacting protein is a second subject protein, the effect of the test 
agent on binding can be determined by determining the effect on multimerization of the 
subject protein. 

[0488] The present invention further provides methods of identifying agents 
that modulate a biological activity of a polypeptide of the invention. The method 
generally comprises contacting a test agent with a sample containing a subject 
polypeptide and assaying a biological activity of the subject polypeptide in the 
presence of the test agent. An increase or a decrease in the assayed biological activity 
in comparison to the activity in a suitable control (e.g., a sample comprising a subject 
polypeptide in the absence of the test agent) is an indication that the substance 
modulates a biological activity of the subject polypeptide. The mixture of 
components is added in any order that provides for the requisite interaction. 
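
The comparison of assayed activity against a suitable control described in paragraph [0488] can be sketched as a short calculation. The Python example below is an illustrative sketch only and is not part of the disclosed methods; the function name `classify_modulation`, the 20% cutoff, and the activity values are hypothetical and chosen purely for demonstration.

```python
def classify_modulation(test_activity, control_activity, threshold=0.20):
    """Classify a test agent by the fractional change it causes versus control.

    `threshold` is a hypothetical cutoff (a 20% change); the source text does
    not specify any particular cutoff.
    """
    if control_activity <= 0:
        raise ValueError("control activity must be positive")
    # Fractional change relative to the sample lacking the test agent.
    change = (test_activity - control_activity) / control_activity
    if change >= threshold:
        return "increases activity", change
    if change <= -threshold:
        return "decreases activity", change
    return "no modulation detected", change


# Hypothetical readings in arbitrary activity units.
label, change = classify_modulation(test_activity=45.0, control_activity=100.0)
print(label, round(change, 2))
```

In practice the cutoff would be set from replicate measurements and the variability of the particular assay rather than fixed in advance.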

[0489] External and internal processes that can affect modulation of a 
macromolecule of the invention include, but are not limited to, infection of a cell by a 
microorganism, including, but not limited to, a bacterium (e.g., Mycobacterium spp., 
Shigella, or Chlamydia), a protozoan (e.g., Trypanosoma spp., Plasmodium spp., or 
Toxoplasma spp.), a fungus, a yeast (e.g., Candida spp.), or a virus (including viruses 
that infect mammalian cells, such as human immunodeficiency virus, foot and mouth 
disease virus, Epstein-Barr virus, and viruses that infect plant cells); change in pH of 
the medium in which a cell is maintained or a change in internal pH; excessive heat 
relative to the normal range for the cell or the multicellular organism; excessive cold 
relative to the normal range for the cell or the multicellular organism; an effector 
molecule such as a hormone, a cytokine, a chemokine, a neurotransmitter; an ingested 
or applied drug; a ligand for a cell-surface receptor; a ligand for a receptor that exists 
internally in a cell, e.g., a nuclear receptor; hypoxia; light; dark; sleep patterns; 
electrical charge; ion concentration of the medium in which a cell is maintained or an 
internal ion concentration, exemplary ions including sodium ions, potassium ions, 
chloride ions, calcium ions, and the like; presence or absence of a nutrient; metal ions; 
a transcription factor; mitogens, including, but not limited to, lipopolysaccharide 
(LPS), pokeweed mitogen; antigens; a tumor suppressor; and cell-cell contact. Such 
processes must be taken into consideration in the screening assay. 

[0490] A variety of other reagents can be included in the screening assay. 
These include salts, neutral proteins, e.g., albumin, detergents, and other compounds 
that facilitate optimal binding and/or reduce non-specific or background interactions. 
Reagents that improve the efficiency of the assay, such as protease inhibitors, 
nuclease inhibitors, or anti-microbial agents, etc., can be used. 

[0491] Accordingly, the present invention provides a method for identifying 
an agent, particularly a biologically active agent that modulates the level of 
expression of a nucleic acid in a cell, the method comprising: combining a candidate 
agent to be tested with a cell comprising a nucleic acid that encodes a polypeptide, 
and determining the agent's effect on polypeptide expression. 

[0492] Some embodiments will detect agents that decrease the biological 
activity of a molecule of the invention. Maximal inhibition of the activity is not 
always necessary, or even desired, in every instance to achieve a therapeutic effect. 
Agents that decrease a biological activity can find use in treating disorders associated 
with the biological activity of the molecule. Alternatively, some embodiments will 
detect agents that increase a biological activity. Agents that increase a biological 
activity of a molecule of the invention can find use in treating disorders associated 
with a deficiency in the biological activity. Agents that increase or decrease a 
biological activity of a molecule of the invention can be selected for further study, and 
assessed for physiological attributes, i.e., cellular availability, cytotoxicity, or 
biocompatibility, and optimized as required. For example, a candidate agent is 
assessed for any cytotoxic activity it may exhibit toward the cell used in the assay 
using well-known assays, such as trypan blue dye exclusion, an MTT ([3-(4,5- 
dimethylthiazol-2-yl)-2,5-diphenyl-2H-tetrazolium bromide]) assay, and the like. 
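
As a hedged illustration of how a colorimetric cytotoxicity readout such as the MTT assay mentioned above is commonly summarized, the following Python sketch converts absorbance readings into percent viability relative to an untreated control. The function name `percent_viability`, the blank-subtraction step, and all absorbance values are assumptions made for illustration and are not taken from this disclosure.

```python
def percent_viability(treated_abs, control_abs, blank_abs=0.0):
    """Return viability of treated cells as a percentage of untreated control.

    All arguments are absorbance readings; `blank_abs` is the reading of a
    cell-free well (an assumed background correction).
    """
    control_signal = control_abs - blank_abs
    if control_signal <= 0:
        raise ValueError("control signal must exceed the blank")
    return 100.0 * (treated_abs - blank_abs) / control_signal


# Hypothetical absorbance values for one candidate agent.
viability = percent_viability(treated_abs=0.45, control_abs=0.85, blank_abs=0.05)
print(f"{viability:.1f}% viable relative to control")
```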

[0493] A variety of different candidate agents can be screened by the above 
methods. Candidate agents encompass numerous chemical classes, as described 
above. 

[0494] Candidate agents are obtained from a wide variety of sources 
including libraries of synthetic or natural compounds. Numerous means are available 
for random and directed synthesis of a wide variety of organic compounds and 
biomolecules, including expression of randomized oligonucleotides and 
oligopeptides. Examples include random peptide libraries obtained by yeast two-hybrid 
screens (Xu et al., 1997), phage libraries (Hoogenboom et al., 1998), and chemically 
generated libraries. Alternatively, libraries of natural compounds in the form of 
bacterial, fungal, plant and animal extracts are available or readily produced, 
including antibodies produced upon immunization of an animal with subject 
polypeptides, or fragments thereof, or with the encoding polynucleotides. 
Additionally, natural or synthetically produced libraries and compounds are readily 
modified through conventional chemical, physical and biochemical means, and can be 
used to produce combinatorial libraries. Further, known pharmacological agents can 
be subjected to directed or random chemical modifications, such as acylation, 
alkylation, esterification, and amidification, etc., to produce structural analogs. 

6. Kits 

[0495] The present invention provides methods for diagnosing disease states 
based on the detected presence and/or level of polynucleotide or polypeptide in a 
biological sample, and/or the detected presence and/or level of biological activity of 
the polynucleotide or polypeptide. These detection methods can be provided as part 
of a kit. Thus, the invention further provides kits for detecting the presence and/or a 
level of a polynucleotide or polypeptide in a biological sample and/or the detected 
presence and/or level of biological activity of the polynucleotide or polypeptide. 
Procedures using these kits can be performed by clinical laboratories, experimental 
laboratories, medical practitioners, or private individuals. 

[0496] The kits of the invention will comprise a molecule of the invention. 
The kits for detecting a polynucleotide will also comprise a moiety that specifically 
hybridizes to a polynucleotide of the invention. The polynucleotide molecule can be 
of any length. For example, it can comprise a polynucleotide of at least 6, at least 7, 
at least 8, or at least 9 contiguous nucleotides of a molecule of the invention. Kits of 
the invention for detecting a subject polypeptide will comprise a moiety that 
specifically binds to a polypeptide of the invention; the moiety includes, but is not 
limited to, a polypeptide-specific antibody. 

[0497] The kits are useful in diagnostic applications. For example, the kit is 
useful to determine whether a given DNA sample isolated from an individual 
comprises an expressed nucleic acid, a polymorphism, or other variant. 

[0498] Kits for detecting polynucleotides comprise a pair of nucleic acids in 
a suitable storage medium, e.g., a buffered solution, in a suitable container. The pair 
of isolated nucleic acid molecules serve as primers in an amplification reaction (e.g., a 
polymerase chain reaction). The kit can further include additional buffers, reagents 
for polymerase chain reaction (e.g., deoxynucleotide triphosphates (dNTP), a 
thermostable DNA polymerase, a solution containing Mg2+ ions (e.g., MgCl2), and 
other components well known to those skilled in the art for carrying out a polymerase 
chain reaction). The kit can further include instructions for use, which may be 
provided in a variety of forms, e.g., printed information, or compact disc, and the like. 
The kit may further include reagents necessary to extract DNA from a biological 
sample and reagents for generating a cDNA copy of an mRNA. The kit may 
optionally provide additional useful components, including, but not limited to, 
buffers, developing reagents, labels, reacting surfaces, means for detections, control 
samples, standards, and interpretive information. 

[0499] In some embodiments, a kit of the invention for detecting a 
polynucleotide, such as an mRNA encoding a polypeptide, comprises a pair of nucleic 
acids that function as "forward" and "reverse" primers that specifically amplify a 
cDNA copy of the mRNA. The "forward" and "reverse" primers are provided as a 
pair of isolated nucleic acid molecules, each from about 10 to about 200 nucleotides 
in length, the first nucleic acid molecule of the pair comprising a sequence of at least 
about 10 contiguous nucleotides having 100% sequence identity to a nucleic acid 
sequence shown in SEQ ID NOS.: 1-104, and the second nucleic acid molecule 
of the pair comprising a sequence of at least about 10 contiguous nucleotides having 
100% sequence identity to the reverse complement of a nucleic acid sequence shown 
in SEQ ID NOS.: 1-104, wherein the sequence of the second nucleic acid molecule 
is located 3' of the nucleic acid sequence of the first nucleic acid molecule. The 
primer nucleic acids are prepared using any known method, e.g., automated synthesis. 
In some embodiments, one or both members of the pair of nucleic acid molecules 
comprise a detectable label. 
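
The geometric constraints on the primer pair in paragraph [0499] can be checked programmatically. The Python sketch below is illustrative only: the function names, the 60-nucleotide `template`, and the 12-nucleotide primers are hypothetical and do not correspond to any sequence in SEQ ID NOS.: 1-104. As a simplifying assumption, each primer is required to match the template exactly over its whole length, which is stricter than the "at least about 10 contiguous nucleotides" language of the text.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def check_primer_pair(template, forward, reverse, min_len=10, max_len=200):
    """Return the amplicon length if the pair fits the template, else None.

    The forward primer must match the template strand, and the reverse primer
    must match the reverse complement of a site located 3' of the forward site.
    """
    template, forward, reverse = template.upper(), forward.upper(), reverse.upper()
    for primer in (forward, reverse):
        if not min_len <= len(primer) <= max_len:
            return None
    fwd_start = template.find(forward)
    if fwd_start < 0:
        return None
    # Site on the template strand to which the reverse primer corresponds.
    rev_site = reverse_complement(reverse)
    rev_start = template.find(rev_site, fwd_start + len(forward))
    if rev_start < 0:
        return None
    return rev_start + len(rev_site) - fwd_start


# Hypothetical 60-nt template; primers are taken from its two ends.
template = "ATGGCTAGCTAGGATCCGATCGATTACGCGTACGTTAGCCGGATATCGCGCTAAGCTTGA"
forward = template[:12]
reverse = reverse_complement(template[-12:])
print(check_primer_pair(template, forward, reverse))
```

A fuller check would also consider melting temperature and primer self-complementarity, which are outside the scope of this sketch.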

[0500] Where the kit provides for polypeptide detection, it can include one 
or more specific antibodies. In some embodiments, the antibody specific to the 
polypeptide is detectably labeled. In other embodiments, the antibody specific to the 
polypeptide is not labeled; instead, a second, detectably-labeled antibody is provided 
that binds to the specific antibody. The kit may further include blocking reagents, 
buffers, and reagents for developing and/or detecting the detectable marker. The kit 
may further include instructions for use, controls, and interpretive information. 

[0501] Where the kit provides for detecting enzymatic activity, it includes a 
substrate that provides for a detectable product when acted upon by a polypeptide of 
interest. The kit may further include reagents necessary to detect and develop the 
detectable marker. 

[0502] The present invention provides for kits with unit doses of an active 
agent. These agents are described in more detail below. In some embodiments, the 
agent is provided in oral or injectable doses. Such kits will comprise containers 
containing the unit doses and an informational package insert describing the use and 
attendant benefits of the drugs in treating a condition of interest. 

Therapeutic Compositions 

[0503] The invention further provides agents identified using a screening 
assay of the invention, and compositions comprising the agents, subject polypeptides, 
subject polynucleotides, recombinant vectors, and/or host cells, including 
pharmaceutical compositions for therapeutic administration. The subject 
compositions can be formulated using well-known reagents and methods. These 
compositions can include a buffer, which is selected according to the desired use of 
the agent, polypeptide, polynucleotide, recombinant vector, or host cell, and can also 
include other substances appropriate to the intended use. Those skilled in the art can 
readily select an appropriate buffer, a wide variety of which are known in the art, 
suitable for an intended use. 

1. Excipients and Formulations 

[0504] In some embodiments, compositions are provided in formulation with 
pharmaceutically acceptable excipients, a wide variety of which are known in the art 
(Gennaro, 2000; Ansel et al., 1999; Kibbe et al., 2000). Pharmaceutically acceptable 
excipients, such as vehicles, adjuvants, carriers or diluents, are readily available to the 
public. Moreover, pharmaceutically acceptable auxiliary substances, such as pH 
adjusting and buffering agents, tonicity adjusting agents, stabilizers, wetting agents 
and the like, are readily available to the public. 

[0505] In pharmaceutical dosage forms, the compositions of the invention 
can be administered in the form of their pharmaceutically acceptable salts, or they can 
also be used alone or in appropriate association, as well as in combination, with other 
pharmaceutically active compounds. The subject compositions are formulated in 
accordance with the mode of potential administration. Administration of the agents can 
be achieved in various ways, including oral, buccal, nasal, rectal, parenteral, 
intraperitoneal, intradermal, transdermal, subcutaneous, intravenous, intra-arterial, 
intracardiac, intraventricular, intracranial, intratracheal, and intrathecal 
administration, etc., or otherwise by implantation or inhalation. Thus, the subject 
compositions can be formulated into preparations in solid, semi-solid, liquid or 
gaseous forms, such as tablets, capsules, powders, granules, ointments, solutions, 
suppositories, injections, inhalants and aerosols. The following methods and 
excipients are merely exemplary and are in no way limiting. 

[0506] For oral preparations, the agents, polynucleotides, and polypeptides 
can be used alone or in combination with appropriate additives to make tablets, 
powders, granules or capsules, for example, with conventional additives, such as 
lactose, mannitol, corn starch, or potato starch; with binders, such as crystalline 
cellulose, cellulose derivatives, acacia, corn starch, or gelatins; with disintegrators, 
such as corn starch, potato starch, or sodium carboxymethylcellulose; with lubricants, 
such as talc or magnesium stearate; and if desired, with diluents, buffering agents, 
moistening agents, preservatives, and flavoring agents. 

[0507] Suitable excipient vehicles are, for example, water, saline, dextrose, 
glycerol, ethanol, or the like, and combinations thereof. In addition, if desired, the 
vehicle can contain minor amounts of auxiliary substances such as wetting or 
emulsifying agents or pH buffering agents. Actual methods of preparing such dosage 
forms are known, or will be apparent, to those skilled in the art (Remington, 1985). 
The composition or formulation to be administered will, in any event, contain a 
quantity of the agent adequate to achieve the desired state in the subject being treated. 

[0508] The agents, polynucleotides, and polypeptides can be formulated into 
preparations for injection by dissolving, suspending or emulsifying them in an 
aqueous or nonaqueous solvent, such as vegetable or other similar oils, synthetic 
aliphatic acid glycerides, esters of higher aliphatic acids or propylene glycol; and if 
desired, with conventional additives such as solubilizers, isotonic agents, suspending 
agents, emulsifying agents, stabilizers and preservatives. Other formulations for oral 
or parenteral delivery can also be used, as conventional in the art. 

[0509] The agents, polynucleotides, and polypeptides can be utilized in 
aerosol formulation to be administered via inhalation. The compounds of the present 
invention can be formulated into pressurized acceptable propellants such as 
dichlorodifluoromethane, propane, nitrogen, and the like. Further, the agent, 
polynucleotides, or polypeptide composition may be converted to powder form for 
administration intranasally or by inhalation, as conventional in the art. 

[0510] Furthermore, the agents can be made into suppositories by mixing 
with a variety of bases such as emulsifying bases or water-soluble bases. The 
compounds of the present invention can be administered rectally via a suppository. 
The suppository can include vehicles such as cocoa butter, carbowaxes and 
polyethylene glycols, which melt at body temperature, yet are solidified at room 
temperature. 

[0511] A polynucleotide, polypeptide, or other modulator, can also be 
introduced into tissues or host cells by other routes, such as viral infection, 
microinjection, or vesicle fusion. For example, expression vectors can be used to 
introduce nucleic acid compositions into a cell as described above. Further, jet 
injection can be used for intramuscular administration (Furth et al., 1992). The DNA 
can be coated onto gold microparticles, and delivered intradermally by a particle 
bombardment device, or "gene gun" as described in the literature (Tang et al., 1992), 
where gold microprojectiles are coated with the DNA, then bombarded into skin cells. 

[0512] Unit dosage forms for oral or rectal administration such as syrups, 
elixirs, and suspensions can be provided wherein each dosage unit, for example, 
teaspoonful, tablespoonful, tablet, or suppository, contains a predetermined amount of 
the composition containing one or more agents. Similarly, unit dosage forms for 
injection or intravenous administration can comprise the agent(s) in a composition as 
a solution in sterile water, normal saline or another pharmaceutically acceptable 
carrier. 

2. Active Agents (or Modulators) 

[0513] The nucleic acid, polypeptide, and modulator compositions of the 
subject invention find use as therapeutic agents in situations where one wishes to 
modulate an activity of a subject polypeptide in a host, particularly the activity of the 
subject polypeptides, or to provide or inhibit the activity at a particular anatomical 
site. Thus, the compositions are useful in treating disorders associated with an 
activity of a subject polypeptide. The following provides further details of active 
agents of the present invention. 

a) Antisense Oligonucleotides 

[0514] In certain embodiments of the invention, the active agent is an agent 
that modulates, and generally decreases or down regulates, the expression of a gene 
encoding a target protein in a host, i.e., antisense molecules. Anti-sense reagents 
include antisense oligonucleotides (ODN), i.e., synthetic ODN having chemical 
modifications from native nucleic acids, or nucleic acid constructs that express such 
anti-sense molecules as RNA. The antisense sequence is complementary to the 
mRNA of the targeted gene, and inhibits expression of the targeted gene products. 
Antisense molecules inhibit gene expression through various mechanisms, e.g., by 
reducing the amount of mRNA available for translation, through activation of RNase 
H, or steric hindrance. One or a combination of antisense molecules can be 
administered, where a combination can comprise multiple different sequences. 

[0515] Antisense molecules can be produced by expression of all or a part of 
the target gene sequence in an appropriate vector, where the transcriptional initiation 
is oriented such that an antisense strand is produced as an RNA molecule. 
Alternatively, the antisense molecule is a synthetic oligonucleotide. Antisense 
oligonucleotides can be chemically synthesized by methods known in the art (Wagner 
et al., 1993; Milligan et al., 1993). Oligonucleotides can be chemically modified from 
the native phosphodiester structure to increase their intracellular stability and binding 
affinity, for example, as described in detail above. Antisense oligonucleotides will 
generally be at least about 7, at least about 12, or at least about 20 nucleotides in 
length, and not more than about 500, not more than about 50, or not more than about 
35 nucleotides in length, where the length is governed by efficiency of inhibition, and 
specificity, including absence of cross-reactivity, and the like. Short oligonucleotides, 
of from about 7 to about 8 bases in length, can be strong and selective inhibitors of 
gene expression (Wagner et al., 1996). 

[0516] A specific region or regions of the endogenous sense strand mRNA 
sequence is chosen to be complemented by the antisense sequence. Selection of a 
specific sequence for the oligonucleotide can use an empirical method, where several 
candidate sequences are assayed for inhibition of expression of the target gene in an in 
vitro or animal model. A combination of sequences can also be used, where several 
regions of the mRNA sequence are selected for antisense complementation. 
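
The empirical selection described in paragraph [0516] starts from a set of candidate antisense sequences drawn from different regions of the mRNA. The Python sketch below shows one way such candidates could be enumerated; it is an illustration under stated assumptions, not the disclosed method. The function name `antisense_candidates`, the 20-nucleotide window, the 5-nucleotide step, and the example mRNA are all hypothetical, and the candidates are written as DNA oligonucleotides in the 5' to 3' direction.

```python
def antisense_candidates(mrna, length=20, step=5):
    """Yield (start, oligo) pairs for windows tiled along an mRNA sequence.

    Each oligo is the reverse complement of the mRNA window, written 5'->3'
    as DNA. `length` and `step` are illustrative defaults, not values
    prescribed by the source text.
    """
    pairs = {"A": "T", "C": "G", "G": "C", "U": "A"}
    mrna = mrna.upper().replace("T", "U")  # accept cDNA-style input as well
    for start in range(0, len(mrna) - length + 1, step):
        window = mrna[start:start + length]
        oligo = "".join(pairs[base] for base in reversed(window))
        yield start, oligo


# Hypothetical 34-nt mRNA fragment used only for demonstration.
example_mrna = "AUGGCUAGCUAGGAUCCGAUCGAUUACGCGUACG"
for start, oligo in antisense_candidates(example_mrna):
    print(start, oligo)
```

Each candidate would then be assayed for inhibition of target gene expression, as the text describes; the enumeration itself makes no prediction about which region is most effective.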

[0517] As an alternative to anti-sense inhibitors, catalytic nucleic acid 
compounds, e.g., ribozymes, or anti-sense conjugates can be used to inhibit gene 
expression. Ribozymes can be synthesized in vitro and administered to the patient, or 
can be encoded in an expression vector, from which the ribozyme is synthesized in 
the targeted cell (WO 9523225; Beigelman et al., 1995). Examples of 
oligonucleotides with catalytic activity are described in WO 9506764. Conjugates of 
anti-sense ODN with a metal complex, e.g., terpyridyl Cu(II), capable of mediating 
mRNA hydrolysis are described in Bashkin et al., 1995. 

b) Interfering RNA 

[0518] In some embodiments, the active agent is an interfering RNA 
(RNAi), including dsRNAi. RNA interference provides a method of silencing 
eukaryotic genes. Double stranded RNA can induce the homology-dependent 
degradation of its cognate mRNA in C. elegans, fungi, plants, Drosophila, and 
mammals (Gaudilliere et al., 2002). Use of RNAi to reduce a level of a particular 
mRNA and/or protein is based on the interfering properties of double-stranded RNA 
derived from the coding regions of a gene. The technique reduces the time between 
identifying an interesting gene sequence and understanding its function, and thus is an 
efficient high-throughput method for disrupting gene function (ONeil, 2001). RNAi 
can also help identify the biochemical mode of action of a drug and to identify other 
genes encoding products that can respond or interact with specific compounds. 

[0519] In one embodiment of the invention, complementary sense and 
antisense RNAs derived from a substantial portion of the subject polynucleotide are 
synthesized in vitro. The resulting sense and antisense RNAs are annealed in an 
injection buffer, and the double-stranded RNA injected or otherwise introduced into 
the subject, i.e., in food or by immersion in buffer containing the RNA (Gaudilliere et 
al., 2002; ONeil et al., 2001; WO 99/32619). In another embodiment, dsRNA derived 
from a gene of the present invention is generated in vivo by simultaneously expressing 
both sense and antisense RNA from appropriately positioned promoters operably 
linked to coding sequences in both sense and antisense orientations. 

c) Peptides and Modified Peptides 

[0520] In some embodiments of the present invention, the active agent is a 
peptide. Suitable peptides include peptides of from about 3 amino acids to about 50, 
from about 5 to about 30, or from about 10 to about 25 amino acids in length. In 
some embodiments, a peptide has a sequence of from about 3 amino acids to about 
50, from about 5 to about 30, or from about 10 to about 25 amino acids of 
a corresponding naturally-occurring protein. In some embodiments, a peptide exhibits 
one or more of the following activities: inhibits binding of a subject polypeptide to an 
interacting protein or other molecule; inhibits subject polypeptide binding to a second 
polypeptide molecule; inhibits a signal transduction activity of a subject polypeptide; 
inhibits an enzymatic activity of a subject polypeptide; or inhibits a DNA binding 
activity of a subject polypeptide. 
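
The length ranges above can be made concrete with a short Python sketch that enumerates every contiguous fragment of a protein sequence falling within a chosen range. The function name `peptide_windows`, its default bounds (taken from the "about 10 to about 25" range), and the example sequence are assumptions for illustration only.

```python
# Illustrative sketch only: names, defaults, and the example sequence are hypothetical.
from typing import Iterator, Tuple


def peptide_windows(protein: str, min_len: int = 10,
                    max_len: int = 25) -> Iterator[Tuple[int, str]]:
    """Yield (start_index, peptide) for each contiguous fragment in the length range."""
    if min_len < 1 or max_len < min_len:
        raise ValueError("require 1 <= min_len <= max_len")
    for length in range(min_len, max_len + 1):
        # A protein shorter than `length` yields no windows for that length.
        for start in range(len(protein) - length + 1):
            yield start, protein[start:start + length]


if __name__ == "__main__":
    example = "MKTAYIAKQRQISFVK"  # hypothetical sequence
    for start, peptide in peptide_windows(example, min_len=14, max_len=15):
        print(start, peptide)
```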

[0521] Peptides can include naturally-occurring and non-naturally occurring 
amino acids. Peptides can comprise D-amino acids, a combination of D- and L-amino 
acids, and various "designer" amino acids (e.g., β-methyl amino acids, Cα-methyl 
amino acids, and Nα-methyl amino acids, etc.) to convey special properties. 
Additionally, peptides can be cyclic. Peptides can include non-classical amino acids 
in order to introduce particular conformational motifs. Any known non-classical 
amino acid can be used. Non-classical amino acids include, but are not limited to, 
1,2,3,4-tetrahydroisoquinoline-3-carboxylate; (2S,3S)-methyl-phenylalanine, 
(2S,3R)-methyl-phenylalanine, (2R,3S)-methyl-phenylalanine and 
(2R,3R)-methyl-phenylalanine; 2-aminotetrahydronaphthalene-2-carboxylic acid; 
hydroxy-1,2,3,4-tetrahydroisoquinoline-3-carboxylate; β-carboline (D and L); HIC (histidine 
isoquinoline carboxylic acid); and HIC (histidine cyclic urea). Amino acid analogs 
and peptidomimetics can be incorporated into a peptide to induce or favor specific 
secondary structures, including, but not limited to, LL-Acp 
(LL-3-amino-2-propenidone-6-carboxylic acid), a β-turn inducing dipeptide analog; β-sheet inducing 
analogs; β-turn inducing analogs; α-helix inducing analogs; γ-turn inducing analogs; 
Gly-Ala turn analogs; amide bond isostere; or tetrazol, and the like. 

[0522] A peptide can be a depsipeptide, which can be linear or cyclic (Kuisle 
et al., 1999). Linear depsipeptides can comprise rings formed through S-S bridges, or 
through an hydroxy or a mercapto group of an hydroxy-, or mercapto-amino acid and 
the carboxyl group of another amino- or hydroxy-acid but do not comprise rings 
formed only through peptide or ester links derived from hydroxy carboxylic acids. 
Cyclic depsipeptides contain at least one ring formed only through peptide or ester 
links, derived from hydroxy carboxylic acids. 

[0523] Peptides can be cyclic or bicyclic. For example, the C-terminal 
carboxyl group or a C-terminal ester can be induced to cyclize by internal 
displacement of the -OH or the ester (-OR) of the carboxyl group or ester respectively 
with the N-terminal amino group to form a cyclic peptide. For example, after 
synthesis and cleavage to give the peptide acid, the free acid is converted to an 
activated ester by an appropriate carboxyl group activator such as 
dicyclohexylcarbodiimide (DCC) in solution, for example, in methylene chloride 
(CH2Cl2), dimethyl formamide (DMF) mixtures. The cyclic peptide is then formed by 
internal displacement of the activated ester with the N-terminal amine. Internal 
cyclization as opposed to polymerization can be enhanced by use of very dilute 
solutions. Methods for making cyclic peptides are well known in the art. 

[0524] The term "bicyclic" refers to a peptide with two ring closures formed 
by covalent linkages between amino acids. A covalent linkage between two 
nonadjacent amino acids constitutes a ring closure, as does a second covalent linkage 
between a pair of adjacent amino acids which are already linked by a covalent peptide 
linkage. The covalent linkages forming the ring closures can be amide linkages, 
i.e., the linkage formed between a free amino on one amino acid and a free carboxyl 
of a second amino acid, or linkages formed between the side chains or "R" groups of 
amino acids in the peptides. Thus, bicyclic peptides can be "true" bicyclic peptides, 
i.e., peptides cyclized by the formation of a peptide bond between the N-terminus and 
the C-terminus of the peptide, or they can be "depsi-bicyclic" peptides, i.e., peptides 
in which the terminal amino acids are covalently linked through their side chain 
moieties. 

[0525] A desamino or descarboxy residue can be incorporated at the terminal 
ends of the peptide, so that there is no terminal amino or carboxyl group, to decrease 
susceptibility to proteases or to restrict conformation. C-terminal functional groups 
include amide, amide lower alkyl, amide di (lower alkyl), lower alkoxy, hydroxy, and 
carboxy, and the lower ester derivatives thereof, and the pharmaceutically acceptable 
salts thereof. 

[0526] In addition to the foregoing N-terminal and C-terminal modifications, 
a peptide or peptidomimetic can be modified with or covalently coupled to one or 
more of a variety of hydrophilic polymers to increase solubility and circulation 
half-life of the peptide. Suitable nonproteinaceous hydrophilic polymers for coupling to a 
peptide include, but are not limited to, polyalkylethers as exemplified by polyethylene 
glycol and polypropylene glycol, polylactic acid, polyglycolic acid, polyoxyalkenes, 
polyvinylalcohol, polyvinylpyrrolidone, cellulose and cellulose derivatives, dextran, 
and dextran derivatives. Generally, such hydrophilic polymers have an average 
molecular weight ranging from about 500 to about 100,000 daltons, from about 2,000 
to about 40,000 daltons, or from about 5,000 to about 20,000 daltons. The peptide 
can be derivatized with or coupled to such polymers using any of the methods set 
forth in Zalipsky, 1995; Monfardini et al., 1995; U.S. Pat. Nos. 4,640,835; 4,496,689; 
4,301,144; 4,670,417; 4,791,192; 4,179,337, or WO 95/34326. 

d) Antibodies 

[0527] The invention provides antibodies that specifically recognize a 
particular polypeptide. Antibodies are obtained by immunizing a host animal with 
peptides, polynucleotides encoding polypeptides, or cells, each comprising all or a 
portion of the target protein ("immunogen"). Suitable host animals include rodents 
(e.g., mouse, rat, guinea pig, hamster), cattle (e.g., sheep, pig, cow, horse, goat), cat, 
dog, chicken, primate, monkey, and rabbit. The origin of the protein immunogen can 
be any species, including mouse, human, rat, monkey, avian, insect, reptile, or 
crustacean. The host animal will generally be a different species than the 
immunogen, e.g., a human protein used to immunize mice. Methods of antibody 
production are well known in the art (Howard and Bethell, 2000; Harlow et al., 1998; 
Harlow and Lane, 1988). 

[0528] The immunogen can comprise the complete protein, or fragments and 
derivatives thereof, or proteins expressed on cell surfaces. Immunogens comprise all 
or a part of one of the subject proteins, where these amino acids contain post- 
translational modifications, such as glycosylation, found on the native target protein. 
Immunogens comprising protein extracellular domains are produced in a variety of 
ways known in the art, e.g., expression of cloned genes using conventional 
recombinant methods, or isolation from tumor cell culture supernatants, etc. The 
immunogen can also be expressed in vivo from a polynucleotide encoding the 
immunogenic peptide introduced into the host animal. 

[0529] Polyclonal antibodies are prepared by conventional techniques. 
These include immunizing the host animal in vivo with the target protein (or 
immunogen) in substantially pure form, for example, comprising less than about 1% 
contaminant. The immunogen can comprise the complete target protein, fragments, 
or derivatives thereof. To increase the immune response of the host animal, the target 
protein can be combined with an adjuvant; suitable adjuvants include alum, dextran 
sulfate, large polymeric anions, and oil & water emulsions, e.g., Freund's adjuvant 
(complete or incomplete). The target protein can also be conjugated to synthetic 
carrier proteins or synthetic antigens. The target protein is administered to the host, 
usually intradermally, with an initial dosage followed by one or more, usually at least 
two, additional booster dosages. Following immunization, blood from the host will 
be collected, followed by separation of the serum from blood cells. The 
immunoglobulin present in the resultant antiserum can be further fractionated using 
known methods, such as ammonium salt fractionation, or DEAE chromatography and 
the like. 

[0530] The method of producing polyclonal antibodies can be varied in some 
embodiments of the present invention. For example, instead of using a single 
substantially isolated polypeptide as an immunogen, one may inject a number of 
different immunogens into one animal for simultaneous production of a variety of 
antibodies. In addition to protein immunogens, the immunogens can be nucleic acids 
(e.g., in the form of plasmids or vectors) that encode the proteins, with facilitating 
agents, such as liposomes, microspheres, etc., or without such agents, such as "naked" 
DNA. 

[0531] Antibodies can also be prepared using a library approach. Briefly, 
mRNA is extracted from the spleens of immunized animals to isolate antibody- 
encoding sequences. The extracted mRNA may be used to make cDNA libraries. 
Such a cDNA library may be normalized and subtracted in a manner conventional in 
the art, for example, to subtract out cDNA hybridizing to mRNA of non-immunized 
animals. The remaining cDNA may be used to create proteins and for selection of 
antibody molecules or fragments that specifically bind to the immunogen. The cDNA 
clones of interest, or fragments thereof, can be introduced into an in vitro expression 
system to produce the desired antibodies, as described herein. 

[0532] In a further embodiment, polyclonal antibodies can be prepared using 
phage display libraries, conventional in the art. In this method, a collection of 
bacteriophages displaying antibody properties on their surfaces are made to contact 
subject polypeptides, or fragments thereof. Bacteriophages displaying antibody 
properties that specifically recognize the subject polypeptides are selected, amplified, 
for example, in E. coli, and harvested. Such a method typically produces single chain 
antibodies. 

[0533] Monoclonal antibodies are also produced by conventional techniques, 
such as fusing an antibody-producing plasma cell with an immortal cell to produce 
hybridomas. Suitable animals will be used, e.g., to raise antibodies against a mouse 
polypeptide of the invention, the host animal will generally be a hamster, guinea pig, 
goat, chicken, or rabbit, and the like. Generally, the spleen and/or lymph nodes of an 
immunized host animal provide the source of plasma cells, which are immortalized by 
fusion with myeloma cells to produce hybridoma cells. Culture supernatants from 
individual hybridomas are screened using standard techniques to identify clones 
producing antibodies with the desired specificity. The antibody can be purified from 
the hybridoma cell supernatants or from ascites fluid present in the host by 
conventional techniques, e.g., affinity chromatography using antigen, e.g., the subject 
protein, bound to an insoluble support, e.g., protein A sepharose, etc. 

[0534] The antibody can be produced as a single chain, instead of the normal 
multimeric structure of the immunoglobulin molecule. Single chain antibodies have 
been previously described (e.g., Jost et al., 1994). DNA sequences encoding parts of 
the immunoglobulin, for example, the variable region of the heavy chain and the 
variable region of the light chain, are ligated to a spacer, such as one encoding at least 
about four small neutral amino acids, e.g., glycine or serine. The protein encoded by 
this fusion allows the assembly of a functional variable region that retains the 
specificity and affinity of the original antibody. 

[0535] The invention also provides intrabodies that are intracellularly 
expressed single-chain antibody molecules designed to specifically bind and 
inactivate target molecules inside cells. Intrabodies have been used in cell assays and 
in whole organisms (Chen et al., 1994; Hassanzadeh et al., 1998). Inducible 
expression vectors can be constructed with intrabodies that react specifically with a 
protein of the invention. These vectors can be introduced into host cells and model 
organisms. 

[0536] The invention also provides "artificial" antibodies, e.g., antibodies 
and antibody fragments produced and selected in vitro. In some embodiments, these 
antibodies are displayed on the surface of a bacteriophage or other viral particle, as 
described above. In other embodiments, artificial antibodies are present as fusion 
proteins with a viral or bacteriophage structural protein, including, but not limited to, 
M13 gene III protein. Methods of producing such artificial antibodies are well known 
in the art (U.S. Patent Nos. 5,516,637; 5,223,409; 5,658,727; 5,667,988; 5,498,538; 
5,403,484; 5,571,698; and 5,625,033). The artificial antibodies, selected for example, 
on the basis of phage binding to selected antigens, can be fused to a Fc fragment of an 
immunoglobulin for use as a therapeutic, as described, for example, in US 5,116,964 
or WO 99/61630. Antibodies of the invention can be used to modulate biological 
activity of cells, either directly or indirectly. A subject antibody can modulate the 
activity of a target cell, with which it has primary interaction, or it can modulate the 
activity of other cells by exerting secondary effects, i.e., when the primary targets 
interact or communicate with other cells. The antibodies of the invention can be 
administered to mammals, and the present invention includes such administration, 
particularly for therapeutic and/or diagnostic purposes in humans. 

[0537] Antibodies may be administered by injection systemically, such as by 
intravenous injection; or by injection or application to the relevant site, such as by 
direct injection into a tumor, or direct application to the site when the site is exposed 
in surgery; or by topical application, such as if the disorder is on the skin, for 
example. 

[0538] For in vivo use, particularly for injection into humans, in some 
embodiments it is desirable to decrease the antigenicity of the antibody. An immune 
response of a recipient against the antibody may potentially decrease the period of 
time that the therapy is effective. Methods of humanizing antibodies are known in the 
art. The humanized antibody can be the product of an animal having transgenic 
human immunoglobulin genes, e.g., constant region genes (e.g., Grosveld and Kollias, 
1992; Murphy and Carter, 1993; Pinkert, 1994; and International Patent Applications 
WO 90/10077 and WO 90/04036). Alternatively, the antibody of interest can be 
engineered by recombinant DNA techniques to substitute the CH1, CH2, CH3, hinge 
domains, and/or the framework domain with the corresponding human sequence (see, 
e.g., WO 92/02190). Both polyclonal and monoclonal antibodies made in non-human 
animals may be "humanized" before administration to human subjects. 

[0539] Chimeric immunoglobulin genes constructed with immunoglobulin 
cDNA are known in the art (Liu et al. 1987a; Liu et al. 1987b). Messenger RNA is 
isolated from a hybridoma or other cell producing the antibody and used to produce 
cDNA. The cDNA of interest can be amplified by the polymerase chain reaction 
using specific primers (U.S. Patent Nos. 4,683,195 and 4,683,202). Alternatively, a 
library is made and screened to isolate the sequence of interest. The DNA sequence 
encoding the variable region of the antibody is then fused to human constant region 
sequences. The sequences of human constant region genes are known in the art 
(Kabat et al., 1991). Human C region genes are readily available from known clones. 
The choice of isotype will be guided by the desired effector functions, such as 
complement fixation, or antibody-dependent cellular cytotoxicity. IgG1, IgG3 and 
IgG4 isotypes, and either of the kappa or lambda human light chain constant regions 
can be used. The chimeric, humanized antibody is then expressed by conventional 
methods. 

[0540] Consensus sequences of heavy ("H") and light ("L") J regions can be 
used to design oligonucleotides for use as primers to introduce useful restriction sites 
into the J region for subsequent linkage of V region segments to human C region 
segments. C region cDNA can be modified by site directed mutagenesis to place a 
restriction site at the analogous position in the human sequence. 

[0541] A convenient expression vector for producing antibodies is one that 
encodes a functionally complete human CH or CL immunoglobulin sequence, with 
appropriate restriction sites engineered so that any VH or VL sequence can be easily 
inserted and expressed, such as plasmids, retroviruses, YACs, or EBV derived 
episomes, and the like. In such vectors, splicing usually occurs between the splice 
donor site in the inserted J region and the splice acceptor site preceding the human C 
region, and also at the splice regions that occur within the human CH exons. 
Polyadenylation and transcription termination occur at native chromosomal sites 
downstream of the coding regions. The resulting chimeric antibody can be joined to 
any strong promoter, including retroviral LTRs, e.g., SV-40 early promoter, 
(Okayama, et al. 1983), Rous sarcoma virus LTR (Gorman et al. 1982), and Moloney 
murine leukemia virus LTR (Grosschedl et al. 1985), or native immunoglobulin 
promoters. 

[0542] In yet other embodiments, the antibodies can be fully human 
antibodies. For example, xenogenic antibodies, which are produced in animals that 
are transgenic for human antibody genes, can be employed. By xenogenic human 
antibodies is meant antibodies that are fully human antibodies, with the exception that 
they are produced in a non-human host that has been genetically engineered to 
express human antibodies (e.g., WO 98/50433; WO 98/24893 and WO 99/53049). 

[0543] Antibody fragments, such as Fv, F(ab')2 and Fab can be prepared by 
cleavage of the intact protein, e.g., by protease or chemical cleavage. These 
fragments can include heavy and light chain variable regions. Alternatively, a 
truncated gene can be designed, e.g., a chimeric gene encoding a portion of the F(ab')2 
fragment that includes DNA sequences encoding the CH1 domain and hinge region of 
the H chain, followed by a translational stop codon. The antibodies of the present 
invention may be administered alone or in combination with other molecules for use 
as a therapeutic, for example, by linking the antibody to a cytotoxic agent, as discussed 
above, or to a radioactive molecule. Radioactive antibodies that are specific to a 
cancer cell, disease cell, or virus-infected cell may be able to deliver a sufficient dose 
of radioactivity to kill such cancer cell, disease cell, or virus-infected cell. The 
antibodies of the present invention can also be used in assays for detection of the 
subject polypeptides. In some embodiments, the assay is a binding assay that detects 
binding of a polypeptide with an antibody specific for the polypeptide; the subject 
polypeptide or antibody can be immobilized, while the subject polypeptide and/or 
antibody can be detectably-labeled. For example, the antibody can be directly labeled 
or detected with a labeled secondary antibody. That is, suitable, detectable labels for 
antibodies include direct labels, which label the antibody to the protein of interest, and 
indirect labels, which label an antibody that recognizes the antibody to the protein of 
interest. 

[0544] These labels include radioisotopes, including, but not limited to, ⁶⁴Cu, 
⁶⁷Cu, ⁹⁰Y, ¹³¹I, ¹³⁷Cs, ¹⁸⁶Re, ²¹¹At, ²¹²Bi, ²¹³Bi, ²²³Ra, ²⁴¹Am, and ²⁴⁴Cm; 
enzymes having detectable products (e.g., luciferase, β-galactosidase, and the like); 
fluorescers and fluorescent labels, e.g., as provided herein; fluorescence emitting 
metals, e.g., ¹⁵²Eu, or others of the lanthanide series, attached to the antibody through 
metal chelating groups such as EDTA; chemiluminescent compounds, e.g., luminol, 
isoluminol, or acridinium salts; bioluminescent compounds, e.g., luciferin, or 
aequorin (green fluorescent protein); and specific binding molecules, e.g., magnetic 
particles, microspheres, nanospheres, and the like. 

[0545] Alternatively, specific-binding pairs may be used, involving, e.g., a 
second stage antibody or reagent that is detectably-labeled and that can amplify the 
signal. For example, a primary antibody can be conjugated to biotin, and horseradish 
peroxidase-conjugated streptavidin added as a second stage reagent. Digoxin and 
antidigoxin provide another such pair. In other embodiments, the secondary antibody 
can be conjugated to an enzyme such as peroxidase in combination with a substrate 
that undergoes a color change in the presence of the peroxidase. The absence or 
presence of antibody binding can be determined by various methods, including flow 
cytometry of dissociated cells, microscopy, radiography, or scintillation counting. 
Such reagents and their methods of use are well known in the art. 

e) Peptide Aptamers 

[0546] Another suitable agent for modulating an activity of a subject 
polypeptide is a peptide aptamer. Peptide aptamers are peptides or small polypeptides 
that act as dominant inhibitors of protein function. Peptide aptamers specifically bind 
to target proteins, blocking their functional ability (Kolonin and Finley, 1998). Due to 
the highly selective nature of peptide aptamers, they can be used not only to target a 
specific protein, but also to target specific functions of a given protein (e.g., a 
signaling function). Further, peptide aptamers can be expressed in a controlled 
fashion by use of promoters which regulate expression in a temporal, spatial or 
inducible manner. Peptide aptamers act dominantly, therefore, they can be used to 
analyze proteins for which loss-of-function mutants are not available. 

[0547] Peptide aptamers that bind with high affinity and specificity to a 
target protein can be isolated by a variety of techniques known in the art. Peptide 
aptamers can be isolated from random peptide libraries by yeast two-hybrid screens 
(Xu et al., 1997). They can also be isolated from phage libraries (Hoogenboom et al., 
1998) or chemically generated peptides/libraries. 

Therapeutic Applications: Methods of Use 

[0548] The instant invention provides various therapeutic methods. In some 
embodiments, methods of modulating, including increasing and inhibiting, a 
biological activity of a subject protein are provided. In some embodiments, methods 
of modulating an enzymatic activity of a subject protein are provided. In some 
embodiments, methods of increasing the level of enzymatically active subject protein 
are provided, while in some embodiments, methods of decreasing a level of 
enzymatically active subject protein are provided. 

[0549] In some embodiments, methods of modulating enzymatic activity of a 
subject protein are provided. In other embodiments, methods of modulating a signal 
transduction activity of a subject protein are provided. In further embodiments, 
methods of modulating interaction of a subject protein with another, interacting 
protein or other macromolecule (e.g., DNA, carbohydrate, lipid) are provided. In 
further embodiments, methods of modulating transport activity of a subject protein are 
provided. In further embodiments, methods of modulating phospholipase activity of a 
subject protein are provided. In further embodiments, methods of modulating 
polymerase activity of a subject protein are provided. In further embodiments, 
methods of modulating nuclease activity of a subject protein are provided. 

[0550] As mentioned above, an effective amount of the active agent (e.g., 
small molecule, antibody specific for a subject polypeptide, a subject polypeptide, or 
a subject polynucleotide) is administered to the host, where "effective amount" means 
a dosage sufficient to produce a desired effect or result. In some embodiments, the 
desired result is at least a reduction in a given biological activity of a subject 
polypeptide, as compared to a control, for example, a decreased level of enzymatically 
active subject protein in the individual, or in a localized anatomical site in the 
individual. In further embodiments, the desired result is at least an increase in a 
biological activity of a subject polypeptide as compared to a control, for example an 
increased level of enzymatically active subject protein in the individual, or in a 
localized anatomical site in the individual. 

[0551] Typically, the compositions of the instant invention will contain from 
less than about 1% to about 95% of the active ingredient, or from about 10% to about 50%. 
Generally, between about 100 mg and about 500 mg will be administered to a child 
and between about 500 mg and about 5 grams will be administered to an adult. 

[0552] Other effective dosages can be readily determined by one of ordinary 
skill in the art through routine trials establishing dose response curves, for example, 
the amount of agent necessary to increase a level of active subject polypeptide can be 
calculated from in vitro experimentation. Those of skill will readily appreciate that 
dose levels can vary as a function of the specific compound, the severity of the 
symptoms, and the susceptibility of the subject to side effects, and preferred dosages 
for a given compound are readily determinable by those of skill in the art by a variety 
of means. For example, in order to calculate the polypeptide, polynucleotide, or 
modulator dose, those skilled in the art can use readily available information with 
respect to the amount necessary to have the desired effect, depending upon the 
particular agent used. 
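
One simple way to read a dose off an experimentally established dose response curve is to interpolate between measured points on a logarithmic dose scale. The Python sketch below shows that calculation under stated assumptions: the function name `interpolate_dose`, the monotonically increasing response, and all numeric values are hypothetical and are not data from the disclosure.

```python
# Illustrative sketch only: function name and all numeric values are hypothetical.
import math
from typing import Sequence


def interpolate_dose(doses: Sequence[float], responses: Sequence[float],
                     target: float) -> float:
    """Estimate the dose producing `target` response.

    Uses linear interpolation of response against log10(dose) between the two
    measured points that bracket the target. Assumes positive doses and a
    response that increases with dose.
    """
    points = sorted(zip(doses, responses))
    for (d0, r0), (d1, r1) in zip(points, points[1:]):
        if r1 > r0 and r0 <= target <= r1:
            frac = (target - r0) / (r1 - r0)
            log_dose = math.log10(d0) + frac * (math.log10(d1) - math.log10(d0))
            return 10 ** log_dose
    raise ValueError("target response is not bracketed by the measured points")


if __name__ == "__main__":
    doses = [1.0, 10.0, 100.0]   # hypothetical doses, arbitrary units
    responses = [0.1, 0.5, 0.9]  # hypothetical fractional responses
    print(interpolate_dose(doses, responses, target=0.3))
```

A fitted model would normally replace interpolation when enough points are available; the sketch is meant only to show the arithmetic behind reading a dose from such a curve.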

[0553] The active agent(s) can be administered to the host via any 
convenient means capable of resulting in the desired result. Administration is 
generally by injection and often by injection to a localized area. The frequency of 
administration will be determined by the caregiver based on patient responsiveness. 
For example, the agents may be administered daily, weekly, or as conventionally 
determined appropriate. 

[0554] A variety of hosts are treatable according to the subject methods. The 
host, or patient, may be from any animal species, and will generally be mammalian, 
e.g., primate sp., e.g., monkeys, chimpanzees, and particularly humans; rodents, 
including mice, rats and hamsters, guinea pig; rabbits; cattle, including equines, 
bovines, pig, sheep, goat, canines; felines; etc. Animal models are of interest for 
experimental investigations, providing a model for treatment of human disease. 

Proliferative Conditions 

[0555] In some embodiments, a protein of the present invention is involved 
in the control of cell proliferation, and an agent of the invention inhibits undesirable 
cell proliferation. Such agents are useful for treating disorders that involve abnormal 
cell proliferation, including, but not limited to, cancer, psoriasis, and scleroderma. 
Whether a particular agent and/or therapeutic regimen of the invention is effective in 
reducing unwanted cellular proliferation, e.g., in the context of treating cancer, can be 
determined using standard methods. For example, the number of cancer cells in a 
biological sample (e.g., blood, a biopsy sample, and the like), can be determined. The 
tumor mass can be determined using standard radiological or biochemical methods. 

[0556] Tumors that can be treated using the methods of the instant invention 
include carcinomas, e.g., colorectal, prostate, breast, bone, kidney, skin, melanoma, 
ductal, endometrial, stomach or other organ of the gastrointestinal tract, pancreatic, 
mesothelioma, dysplastic oral mucosa, invasive oral cancer, non-small cell lung 
carcinoma ("NSCL"), transitional and squamous cell urinary carcinoma; brain cancer 
and neurological malignancies, e.g., neuroblastoma, glioblastoma, astrocytoma, and 
gliomas; lymphomas and leukemias such as myeloid leukemia, myelogenous 
leukemia, hematological malignancies, such as childhood acute leukemia, 
non-Hodgkin's lymphomas, chronic lymphocytic leukemia, malignant cutaneous T-cell 
lymphoma, mycosis fungoides, non-MF cutaneous T-cell lymphoma, lymphomatoid 
papulosis, T-cell rich cutaneous lymphoid hyperplasia, bullous pemphigoid, discoid 
lupus erythematosus, lichen planus, and human follicular lymphoma; cancers of the 
reproductive system, e.g., cervical and ovarian cancers and testicular cancers; liver 
cancers including hepatocellular carcinoma ("HCC") and tumors of the biliary duct; 
multiple myelomas; tumors of the esophageal tract; other lung cancers and tumors 
including small cell and clear cell; Hodgkin's lymphomas; adenocarcinoma; and 
sarcomas, including soft tissue sarcomas. 

Immunotherapeutic Approaches to Proliferative Conditions 

[0557] The polynucleotides, polypeptides, and modulators of the present 
invention find use in immunotherapy of hyperproliferative disorders, including 
cancer, neoplastic, and paraneoplastic disorders. That is, the subject molecules can 
correspond to tumor antigens, of which 1770 have been identified to date (Yu and 
Restifo, 2002). Immunotherapeutic approaches include passive immunotherapy and 
vaccine therapy and can accomplish both generic and antigen-specific cancer 
immunotherapy. 

[0558] Passive immunity approaches involve antibodies of the invention that 
are directed toward specific tumor-associated antigens. Such antibodies can eradicate 
systemic tumors at multiple sites, without eradicating normal cells. In some 
embodiments, the antibodies are combined with radioactive components, as provided 
above, for example, combining the antibody's ability to specifically target tumors with 
the added lethality of the radioisotope to the tumor DNA. 

[0559] Useful antibodies comprise a discrete epitope or a combination of 
nested epitopes, i.e., a 10-mer epitope and associated peptide multimers incorporating 
all potential 8-mers and 9-mers, or overlapping epitopes (Dutoit et al., 2002). Thus a 
single antibody can interact with one or more epitopes. Further, the antibody can be 
used alone or in combination with different antibodies, that all recognize either a 
single or multiple epitopes. 
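
The nesting described above can be enumerated directly: a 10-mer contains three overlapping 8-mers and two overlapping 9-mers. The Python sketch below lists them; the function name `nested_epitopes` and the example 10-mer are hypothetical and do not correspond to any disclosed sequence.

```python
# Illustrative sketch only: the function name and example epitope are hypothetical.
from typing import Dict, Iterable, List


def nested_epitopes(epitope: str,
                    lengths: Iterable[int] = (8, 9)) -> Dict[int, List[str]]:
    """Map each requested length k to all contiguous k-mers within `epitope`."""
    return {
        k: [epitope[i:i + k] for i in range(len(epitope) - k + 1)]
        for k in lengths
    }


if __name__ == "__main__":
    ten_mer = "ACDEFGHIKL"  # hypothetical 10-mer
    for k, kmers in nested_epitopes(ten_mer).items():
        print(k, kmers)
```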

[0560] Neutralizing antibodies can provide therapy for cancer and 
proliferative disorders. Neutralizing antibodies that specifically recognize a secreted 
protein or peptide of the invention can bind to the secreted protein or peptide, e.g., in 
a bodily fluid or the extracellular space, thereby modulating the biological activity of 
the secreted protein or peptide. For example, neutralizing antibodies specific for 
secreted proteins or peptides that play a role in stimulating the growth of cancer cells 
can be useful in modulating the growth of cancer cells. Similarly, neutralizing 
antibodies specific for secreted proteins or peptides that play a role in the 
differentiation of cancer cells can be useful in modulating the differentiation of cancer 
cells. 

[0561] Vaccine therapy involves the use of polynucleotides, polypeptides, or 
agents of the invention as immunogens for tumor antigens (Machiels et al., 2002). 
For example, peptide-based vaccines of the invention include unmodified subject 
polypeptides, fragments thereof, and MHC class I and class II-restricted peptides 
(Knutson et al., 2001), comprising, for example, the disclosed sequences with 
universal, nonspecific MHC class II-restricted epitopes. Peptide-based vaccines 
comprising a tumor antigen can be given directly, either alone or in conjunction with 
other molecules. The vaccines can also be delivered orally by producing the antigens 
in transgenic plants that can be subsequently ingested (U.S. Patent No. 6,395,964). 

[0562] In some embodiments, antibodies themselves can be used as antigens 
in anti-idiotype vaccines. That is, administering an antibody to a tumor antigen 
stimulates B cells to make antibodies to that antibody, which in turn recognize the 
tumor cells. 

[0563] Nucleic acid-based vaccines can deliver tumor antigens as 
polynucleotide constructs encoding the antigen. Vaccines comprising genetic 
material, such as DNA or RNA, can be given directly, either alone or in conjunction 
with other molecules. Administration of a vaccine expressing a molecule of the 
invention, e.g., as plasmid DNA, leads to persistent expression and release of the 
therapeutic immunogen over a period of time, helping to control unwanted tumor 
growth. 

[0564] In some embodiments, nucleic acid-based vaccines encode subject 
antibodies. In such embodiments, the vaccines (e.g., DNA vaccines) can include 
post-transcriptional regulatory elements, such as the post-transcriptional regulatory 
RNA element (WPRE) derived from Woodchuck Hepatitis Virus. These 
post-transcriptional regulatory elements can be used to target the antibody, or a fusion 
protein comprising the antibody and a co-stimulatory molecule, to the tumor 
microenvironment (Pertl et al., 2003). 

[0565] Besides stimulating anti-tumor immune responses by inducing 
humoral responses, vaccines of the invention can also induce cellular responses, 
including stimulating T-cells that recognize and kill tumor cells directly. For 
example, nucleotide-based vaccines of the invention encoding tumor antigens can be 
used to activate the CD8+ cytotoxic T lymphocyte arm of the immune system. 

[0566] In some embodiments, the vaccines activate T-cells directly, and in 
others they enlist antigen-presenting cells to activate T-cells. Killer T-cells are 
primed, in part, by interacting with antigen-presenting cells, e.g., dendritic cells. In 
some embodiments, plasmids comprising the nucleic acid molecules of the invention 
enter antigen-presenting cells, which in turn display the encoded tumor antigens that 
contribute to killer T-cell activation. Again, the tumor antigens can be delivered as 
plasmid DNA constructs, either alone or with other molecules. 

[0567] In further embodiments, RNA can be used. For example, dendritic 
cells can be transfected with RNA encoding tumor antigens (Heiser et al., 2002; 
Mitchell and Nair, 2000). This approach overcomes the limitations of obtaining 
sufficient quantities of tumor material, extending therapy to patients otherwise 
excluded from clinical trials. For example, a subject RNA molecule isolated from 
tumors can be amplified using RT-PCR. In some embodiments, the RNA molecule of 
the invention is directly isolated from tumors and transfected into dendritic cells with 
no intervening cloning steps. 

[0568] In some embodiments the molecules of the invention are altered such 
that the peptide antigens are more highly antigenic than in their native state. These 
embodiments address the need in the art to overcome the poor in vivo immunogenicity 
of most tumor antigens by enhancing tumor antigen immunogenicity via modification 
of epitope sequences (Yu and Restifo, 2002). 

[0569] Another recognized problem of cancer vaccines is the presence of 
preexisting neutralizing antibodies. Some embodiments of the present invention 
overcome this problem by using viral vectors from non-mammalian natural hosts, e.g., 
avian pox viruses. Alternative embodiments that also circumvent preexisting 
neutralizing antibodies include genetically engineered influenza viruses, and the use 
of "naked" plasmid DNA vaccines that contain DNA with no associated protein (Yu 
and Restifo, 2002). 

[0570] All of the immunogenic methods of the invention can be used alone 
or in combination with other conventional or unconventional therapies. For example, 
immunogenic molecules can be combined with other molecules that have a variety of 
antiproliferative effects, or with additional substances that help stimulate the immune 
response, i.e., adjuvants or cytokines. 

[0571] For example, in some embodiments, nucleic acid vaccines encode an 
alphaviral replicase enzyme, in addition to tumor antigens. This recently discovered 
approach to vaccine therapy successfully combines therapeutic antigen production 
with the induction of the apoptotic death of the tumor cell (Yu and Restifo, 2002). 

[0572] In certain other embodiments, a DNA or RNA vaccine of the present 
invention can also be directed against angiogenesis, the production of blood vessels 
in the vicinity of the tumor. This approach, called antiangiogenesis, deprives the 
cancer cells of nutrients. For example, the antiangiogenic molecules angiostatin (a 
fragment of plasminogen), endostatin (a fragment of collagen XVIII), interferon-γ, 
interferon-γ inducible protein 10, interleukin 12, thrombospondin, platelet factor-4, 
calreticulin, or its protein fragment vasostatin can be used to treat tumors by 
suppressing neovascularization and thereby inhibiting growth (Cheng et al., 2001). The 
antiangiogenesis approach can be used alone, or in conjunction with molecules 
directed to tumor antigens. 

[0573] Furthermore, adjuvants can be used in conjunction with the 
antibodies and vaccines disclosed herein. Adjuvants help boost the general immune 
response, for example, concentrating immune cells to the specific area where they are 
needed. They can be added to a cancer vaccine itself or administered separately, and 
in some embodiments, a viral vector can be engineered to display adjuvant proteins on 
its surface. 

[0574] Cytokines can also be used to help stimulate immune response. 
Cytokines act as chemical messengers, recruiting immune cells that help the killer 
T-cells to the site of attack. An example of a cytokine is granulocyte-macrophage 
colony-stimulating factor (GM-CSF), which stimulates the proliferation of antigen- 
presenting cells, thus boosting an organism's response to a cancer vaccine. As with 
adjuvants, cytokines can be used in conjunction with the antibodies and vaccines 
disclosed herein. For example, they can be incorporated into the antigen-encoding 
plasmid or introduced via a separate plasmid, and in some embodiments, a viral 
vector can be engineered to display cytokines on its surface. 

Inflammation and Immunity 

[0575] In other embodiments, e.g., where the subject polypeptide is involved 
in modulating inflammation or immune function, the invention provides agents for 
treating such inflammation or immune disorders. Disease states that are treatable 
using formulations of the invention include various types of arthritis, such as 
rheumatoid arthritis and osteoarthritis; autoimmune thyroiditis; various chronic 
inflammatory conditions of the skin, such as psoriasis, and of the intestine, such as 
inflammatory bowel disease (IBD); insulin-dependent diabetes; autoimmune diseases 
such as multiple sclerosis (MS), intestinal immune disorders, and systemic lupus 
erythematosus (SLE); allergic diseases; transplant rejections; adult respiratory distress 
syndrome; atherosclerosis; and ischemic diseases due to closure of the peripheral 
vasculature, cardiac vasculature, and vasculature in the central nervous system (CNS). 
After reading the present disclosure, those skilled in the art will recognize other 
disease states and/or symptoms which might be treated and/or mitigated by the 
administration of formulations of the present invention. 

[0576] Neutralizing antibodies can provide immunosuppressive therapy for 
inflammatory and autoimmune disorders. Neutralizing antibodies can be used to treat 
disorders such as, for example, multiple sclerosis, rheumatoid arthritis, inflammatory 
bowel disease, transplant rejection, and psoriasis. Neutralizing antibodies that 
specifically recognize a secreted protein or peptide of the invention can bind to the 
secreted protein or peptide, e.g., in a bodily fluid or the extracellular space, thereby 
modulating the biological activity of the secreted protein or peptide. For example, 
neutralizing antibodies specific for secreted proteins or peptides that play a role in 
activating immune cells are useful as immunosuppressants. 

Disorders Related to Cell Death 

[0577] Where a polypeptide of the invention is involved in modulating cell 
death, an agent of the invention is useful for treating conditions or disorders relating 
to cell death (e.g., DNA damage, cell death, apoptosis). Cell death-related indications 
that can be treated using the methods of the invention to reduce cell death in a 
eukaryotic cell, include, but are not limited to, cell death associated with Alzheimer's 
disease, Parkinson's disease, rheumatoid arthritis, autoimmune thyroiditis, septic 
shock, sepsis, stroke, central nervous system inflammation, intestinal inflammation, 
osteoporosis, ischemia, reperfusion injury, cardiac muscle cell death associated with 
cardiovascular disease, polycystic kidney disease, cell death of endothelial cells in 
cardiovascular disease, degenerative liver disease, multiple sclerosis, amyotrophic 
lateral sclerosis, cerebellar degeneration, ischemic injury, cerebral infarction, 
myocardial infarction, acquired immunodeficiency syndrome (AIDS), 
myelodysplastic syndromes, aplastic anemia, male pattern baldness, and head injury 
damage. Also included are conditions in which DNA damage to a cell is induced by 
external conditions, including but not limited to irradiation, radiomimetic drugs, 
hypoxic injury, chemical injury, and damage by free radicals. Also included are any 
hypoxic or anoxic conditions, e.g., conditions relating to or resulting from ischemia, 
myocardial infarction, cerebral infarction, stroke, bypass heart surgery, organ 
transplantation, and neuronal damage. 

[0578] DNA damage can be detected using any known method, including, 
but not limited to, a Comet assay (commercially available from Trevigen, Inc.), which 
is based on alkaline lysis of labile DNA at sites of damage; and immunological assays 
using antibodies specific for aberrant DNA structures, e.g., 8-OHdG. 

[0579] Cell death can be measured using any known method, generally one 
of a variety of known methods for measuring cell viability. Such assays are generally 
based on entry into the cell of a detectable 
compound (or a compound that becomes detectable upon interacting with, or being 
acted on by, an intracellular component) that would normally be excluded from a 
normal, living cell by its structurally and functionally intact cell membrane. Such 
compounds include substrates for intracellular enzymes, including, but not limited to, 
a fluorescent substrate for esterase; dyes that are excluded from living cells, including, 
but not limited to, trypan blue; and DNA-binding compounds, including, but not 
limited to, an ethidium compound such as ethidium bromide and ethidium 
homodimer, and propidium iodide. 

[0580] Apoptosis, or programmed cell death, is a regulated process leading 
to cell death via a series of well-defined morphological changes. Programmed cell 
death provides a balance for cell growth and multiplication, eliminating unnecessary 
cells. The default state of the cell is to remain alive. A cell enters the apoptotic 
pathway when an essential factor is removed from the extracellular environment or 
when an internal signal is activated. Genes and proteins of the invention that suppress 
the growth of tumors by activating cell death provide the basis for treatment strategies 
for hyperproliferative disorders and conditions. 

[0581] Apoptosis can be assayed using any known method. Assays can be 
conducted on cell populations or an individual cell, and include morphological assays 
and biochemical assays. A non-limiting example of a method of determining the level 
of apoptosis in a cell population is TUNEL (TdT-mediated dUTP nick-end labeling) 
labeling of the 3'-OH free end of DNA fragments produced during apoptosis 
(Gavrieli et al., 1992). The TUNEL method consists of catalytically adding a 
nucleotide, which has been conjugated to a chromogen system or a fluorescent tag, to 
the 3'-OH end of the 180-bp (base pair) oligomer DNA fragments, in order to detect 
the fragments. The presence of a DNA ladder of 180-bp oligomers is indicative of 
apoptosis. Procedures to detect cell death based on the TUNEL method are available 
commercially, e.g., from Boehringer Mannheim (Cell Death Kit) and Oncor (Apoptag 
Plus). 
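
As a non-limiting sketch, a set of fragment sizes read from a gel could be checked for the 180-bp ladder pattern described above as follows. The function name, the 20-bp tolerance, the requirement of at least three rungs, and the example sizes are assumptions chosen for illustration, not parameters taken from the cited methods.

```python
def is_ladder(fragment_sizes_bp, unit_bp=180, tolerance_bp=20, min_rungs=3):
    """Return True if every band lies near a multiple of `unit_bp` and
    at least `min_rungs` distinct multiples are represented.

    `tolerance_bp` and `min_rungs` are illustrative assumptions.
    """
    rungs = set()
    for size in fragment_sizes_bp:
        n = round(size / unit_bp)          # nearest multiple of the unit
        if n < 1 or abs(size - n * unit_bp) > tolerance_bp:
            return False                   # band does not fit the ladder
        rungs.add(n)
    return len(rungs) >= min_rungs


# Hypothetical band sizes (bp) estimated against a size standard.
apoptotic_sample = [185, 362, 540, 715]
smeared_sample = [2300]

print(is_ladder(apoptotic_sample))
print(is_ladder(smeared_sample))
```

In practice the tolerance would depend on the resolution of the gel and the size standard used.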

[0582] Another marker that is currently available is annexin, sold under the 
trademark APOPTEST™. This marker is used in the "Apoptosis Detection Kit," 
which is also commercially available, e.g., from R&D Systems. During apoptosis, a 
cell membrane's phospholipid asymmetry changes such that the phospholipids are 
exposed on the outer membrane. Annexins are a homologous group of proteins that 
bind phospholipids in the presence of calcium. A second reagent, propidium iodide 
(PI), is a DNA binding fluorochrome. When a cell population is exposed to both 
reagents, apoptotic cells stain positive for annexin and negative for PI, necrotic cells 
stain positive for both, and live cells stain negative for both. Other methods of testing for 
apoptosis are known in the art and can be used, including, e.g., the method disclosed 
in U.S. Patent No. 6,048,703. 
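
The two-reagent staining logic described above can be summarized in a short sketch. The function name and the example population are hypothetical; the annexin-negative, PI-positive combination is not addressed in the text and is therefore reported as unclassified rather than assigned a state.

```python
from collections import Counter


def classify_cell(annexin_positive, pi_positive):
    """Assign a state from annexin and propidium iodide (PI) staining."""
    if annexin_positive and not pi_positive:
        return "apoptotic"
    if annexin_positive and pi_positive:
        return "necrotic"
    if not annexin_positive and not pi_positive:
        return "live"
    # Annexin-negative, PI-positive is not covered by the description.
    return "unclassified"


# Hypothetical population: (annexin_positive, pi_positive) per cell.
population = [(True, False), (True, True), (False, False), (False, False)]
counts = Counter(classify_cell(a, p) for a, p in population)
print(dict(counts))
```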

Other Pathological Conditions 

[0583] Other pathological conditions that can be treated using the methods 
of the instant invention include disorders of hematopoiesis, cell differentiation, 
disorders of ion channels, e.g., cystic fibrosis, and tissue or organ hypertrophy, 
bacterial disorders, viral disorders, including acquired immunodeficiency syndrome 
(AIDS), angiogenesis, metastasis, metabolic disorders such as diabetes and obesity, 
cardiovascular disorders such as congestive heart failure and stroke, male erectile 
dysfunction, and the disorders described throughout the specification. 

Investigative Applications 

[0584] The subject nucleic acid compositions find use in a variety of 
different investigative applications. Applications of interest include identifying 
genomic DNA sequence using molecules of the invention, identifying homologs of 
molecules of the invention, creating a source of novel promoter elements, identifying 
expression regulatory factors, creating a source of probes and primers for 
hybridization applications, identifying expression patterns in biological specimens, 
preparing cell or animal models to investigate the function of the molecules of the 
invention, and preparing in vitro models to investigate the function of the molecules 
of the invention. 

Genomic DNA Sequences 

[0585] Human genomic polynucleotide sequences corresponding to 
molecules of the present invention are identified by conventional means, such as, for 
example, by probing a genomic DNA library with all or a portion of the 
polynucleotide sequences. 

Homologs 

[0586] Homologs are identified by any of a number of methods. By using 
probes, particularly labeled probes of DNA sequences, one can isolate homologous or 
related genes, as described in detail above. Briefly, a fragment of the provided cDNA 
can be used as a hybridization probe against a cDNA library from the target organism 
of interest, under various stringency conditions, e.g., low stringency conditions. The 
probe can be a large fragment, or one or more short degenerate primers, and is 
typically labeled. Sequence identity can be determined by hybridization under 
stringent conditions, as described in detail above. Nucleic acids having a region of 
substantial identity or sequence similarity to the provided nucleic acid sequences, for 
example allelic variants, related genes, or genetically altered versions of the gene, 
bind to the provided sequences under less stringent hybridization conditions. 

Promoter Elements and Expression Regulatory Factors 

[0587] The sequence of the 5' flanking region can be utilized as promoter 
elements, including enhancer binding sites that provide for tissue-specific expression 
and developmental regulation in tissues where the subject genes are expressed, 
providing promoters that mimic the native pattern of expression. Naturally occurring 
polymorphisms in the promoter region are useful for determining natural variations in 
expression, particularly those that may be associated with disease. Promoters or 
enhancers that regulate the transcription of the polynucleotides of the present 
invention are obtainable by use of PCR techniques using human tissues, and one or 
more of the present primers. 

[0588] Alternatively, mutations can be introduced into the promoter region 
to determine the effect of altering expression in experimentally defined systems. 
Methods for the identification of specific DNA motifs involved in the binding of 
transcriptional factors are known in the art, for example sequence similarity to known 
binding motifs, and gel retardation studies (Blackwell et al., 1995; Mortlock et al., 
1996; Joulin and Richard-Foy, 1995). 
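
As a non-limiting sketch of identifying candidate binding sites by sequence similarity to a known motif, a promoter sequence can be scanned against an IUPAC consensus. The function name, the mismatch threshold, and the promoter fragment are hypothetical; the E-box consensus CANNTG is used only as a familiar example motif, and only the given strand is scanned.

```python
# IUPAC nucleotide codes mapped to the bases they permit.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}


def find_motif(sequence, consensus, max_mismatches=0):
    """Return (position, matched_site, mismatches) for each window of
    `sequence` that matches `consensus` within `max_mismatches`."""
    sequence = sequence.upper()
    consensus = consensus.upper()
    width = len(consensus)
    hits = []
    for i in range(len(sequence) - width + 1):
        window = sequence[i:i + width]
        mismatches = sum(
            base not in IUPAC[code] for base, code in zip(window, consensus)
        )
        if mismatches <= max_mismatches:
            hits.append((i, window, mismatches))
    return hits


# Hypothetical 5' flanking fragment, for illustration only.
promoter = "GGTACACGTGTTCAGCTGAA"
for position, site, mismatches in find_motif(promoter, "CANNTG"):
    print(position, site, mismatches)
```

Candidate sites found this way would still need experimental confirmation, e.g., by the gel retardation studies mentioned above.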

[0589] The regulatory sequences can be used to identify cis-acting sequences 
required for transcriptional or translational regulation of expression, especially in 
different tissues or stages of development, and to identify cis-acting sequences and 
trans-acting factors that regulate or mediate expression. Such transcriptional or 
translational control regions can be operably linked to a gene in order to promote 
expression of wild type genes or of proteins of interest in cultured cells, embryonic, 
fetal or adult tissues, and for gene therapy (Hooper, 1993). 

Primers and Probes 

[0590] Small DNA fragments are useful as primers for reactions that involve 
nucleic acid hybridization, as described in detail above. Briefly, pairs of primers will 
be used in amplification reactions, such as PCR. Amplification primers hybridize to 
complementary strands of DNA, for example, under stringent conditions, and will 
prime towards each other. In some embodiments a pair of primers will generate an 
amplification product of at least about 50 nt, or at least about 100 nt. Algorithms for 
the selection of primer sequences are generally known, and are available in 
commercial software packages. 
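
As a non-limiting sketch, the size of the product generated by a primer pair can be estimated by locating the forward primer on one strand and the reverse complement of the reverse primer downstream of it. The function names, the primer sequences, and the template are hypothetical, and exact matching is assumed in place of a hybridization model.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def amplicon_length(template, forward, reverse):
    """Length in nt of the product primed by `forward` (matching the given
    strand) and `reverse` (matching the opposite strand), or None if either
    primer has no exact site. Exact matching is a simplifying assumption."""
    template = template.upper()
    start = template.find(forward.upper())
    if start == -1:
        return None
    reverse_site = reverse_complement(reverse)
    end = template.find(reverse_site, start + len(forward))
    if end == -1:
        return None
    return end + len(reverse_site) - start


# Hypothetical primers and template, for illustration only.
forward = "ATGGCCAAGT"
reverse = "CCTTGAGGTA"
template = "TTT" + forward + "ACGT" * 10 + reverse_complement(reverse) + "GGG"

length = amplicon_length(template, forward, reverse)
print(length, length is not None and length >= 50)
```

Because the two primers bind opposite strands and prime towards each other, the product spans from the 5' end of the forward primer site to the end of the reverse primer site.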

[0591] The nucleotides can also be used as probes to identify genomic DNA 
or gene expression in a biological specimen, as described above and as is well 
established in the art. Briefly, DNA or mRNA is isolated from a cell sample. 
Detection of mRNA hybridizing to the subject sequence is indicative of gene 
expression in the sample. The mRNA can be amplified by RT-PCR, using reverse 
transcriptase to form a complementary DNA strand, followed by polymerase chain 
reaction amplification using primers specific for the subject DNA sequences. 
Alternatively, the mRNA sample is separated by gel electrophoresis, transferred to a 
suitable support, e.g., nitrocellulose, nylon, etc., and then probed with a fragment of 
the subject nucleotides. Other techniques, such as oligonucleotide ligation 
assays, in situ hybridizations, and hybridization to probes arrayed on a solid chip may 
also find use. 

Targeted Mutations for In Vivo and In Vitro Models 

[0592] The sequence of a gene according to the subject invention, including 
flanking promoter regions and coding regions, can be mutated in various ways known 
in the art to generate targeted changes, e.g., changes in promoter strength, or sequence 
of the encoded protein, etc. The DNA sequence or protein product of such a mutation 
will usually be substantially similar to the sequences provided herein. The sequence 
changes can be substitutions, insertions, deletions, or a combination thereof. 
Deletions can further include larger changes, such as deletions of a domain or exon. 

[0593] Techniques for in vitro mutagenesis of cloned genes are known. 
Examples of protocols for site specific mutagenesis may be found in Gustin et al., 
1993; Barany, 1985; Colicelli et al., 1985; Prentki et al., 1984. Methods for site 
specific mutagenesis can be found in Sambrook et al., 1989 (pp. 15.3-15.108); Weiner 
et al., 1993; Sayers et al., 1992; Jones and Winistorfer; Barton et al., 1990; Marotti and 
Tomich, 1989; and Zhu, 1989. Such mutated genes can be used to study 
structure-function relationships of the subject proteins, or to alter properties of the 
protein that affect its function or regulation. Other modifications of interest include epitope 
tagging, e.g., with hemagglutinin (HA), FLAG, or c-myc. For studies of subcellular 
localization, fluorescent fusion proteins can be used. 

[0594] The subject nucleic acids can be used to generate transgenic, non- 
human animals and/or site-specific gene modifications in cell lines; suitable methods 
are known in the art (Grosveld and Kollias, 1992; Hooper, 1993; Murphy and Carter, 
1993; Pinkert, 1994). Thus, in some embodiments, the invention provides a non- 
human transgenic animal comprising, as a transgene integrated into the genome of the 
animal, a nucleic acid molecule comprising a sequence encoding a subject 
polypeptide in operable linkage with a promoter, such that the subject polypeptide- 
encoding nucleic acid molecule is expressed in a cell of the animal. Either a complete 
or partial sequence of a gene native to the host can be introduced. Alternatively, a 
complete or partial sequence of a gene exogenous to the host animal, e.g., a human 
sequence of the subject invention, can be introduced. Transgenic animals can be 
made through homologous recombination, where the endogenous locus is altered. 
Thus, DNA constructs for homologous recombination will comprise at least a portion 
of the human gene or of a gene native to the species of the host animal, wherein the 
gene has the desired genetic modification(s), and includes regions of homology to the 
target locus. Methods for generating mammalian cells having targeted gene 
modifications through homologous recombination are known in the art (Keown et al., 
1990). 

[0595] Alternatively, a nucleic acid construct is randomly integrated into the 
genome. Vectors for stable integration include plasmids, retroviruses and other 
animal viruses, and YACs. DNA constructs for random integration need not include 
regions of homology to mediate recombination. 

[0596] Conveniently, markers for positive and negative selection are 
included. A detectable marker, such as lacZ, can be introduced into a locus at which 
up-regulation of expression will result in a detectable change in phenotype. 

[0597] Transformed ES or embryonic cells can be used to produce 
transgenic animals. An embryonic stem (ES) cell line can be a source of embryonic 
stem cells, or they can be newly obtained from a host animal, e.g., a mouse, rat, or 
guinea pig. The cells are grown on an appropriate fibroblast-feeder layer or in the 
presence of leukemia inhibitory factor (LIF). Following transformation, the cells are 
plated for growth onto a feeder layer in an appropriate medium. Cells containing the 
relevant construct can be detected by employing a selective medium and analyzing 
them for the occurrence of homologous recombination or integration of the construct. 
Positive colonies can be used for embryo manipulation and blastocyst injection. 
Blastocysts are obtained from 4- to 6-week-old super-ovulated females. The ES cells 
are trypsinized, and the modified cells are injected into the blastocoel of the 
blastocyst. After injection, the blastocysts are returned to each uterine horn of 
pseudopregnant female animals that proceed to term. The resulting offspring are 
screened for the construct. By providing for a different phenotype of the blastocyst 
and the genetically modified cells, chimeric progeny can be readily detected. 

[0598] The chimeric animals are screened for the presence of the modified 
gene, and males and females having the modification are mated to produce 
homozygous progeny. If the gene alterations cause lethality at some point in 
development, tissues or organs can be maintained as allogeneic or congenic grafts or 
transplants, or in in vitro culture. The transgenic animals can be any non-human 
mammal. 

[0599] The modified cells or animals are useful in the study of gene 
function and regulation. For example, a series of small deletions and/or substitutions 
can be made in the host's native gene to determine the role of different exons in 
biological processes such as oncogenesis or signal transduction. Of interest is the use 
of genes to construct transgenic animal models for cancer, where expression of the 
subject protein is specifically reduced or absent. Specific constructs of interest 
include anti-sense constructs, which will block expression, expression of dominant 
negative mutations, and gene over-expression. 

[0600] One can also provide for expression of the gene, e.g., a subject gene, 
or variants thereof, in cells or tissues where it is not normally expressed, at levels not 
normally present in such cells or tissues, or at abnormal times of development. One 
can also generate host cells (including host cells in transgenic animals) that comprise 
a heterologous nucleic acid molecule which encodes a polypeptide which functions to 
modulate expression of an endogenous promoter or other transcriptional regulatory 
region, or the biological activity of a subject polypeptide. 

[0601] The transgenic animals can also be used in functional studies, for 
example drug screening, to determine the effect of a candidate drug on a biological 
activity of a subject polypeptide. 
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Table 1. Characteristics of the Fantom Mouse Protein With the Highest Degree 
of Similarity to the Claimed Sequences 



FPID 


Fantom Top Hit Annotation 


HG1000214N0 ,160000 gene_j)redictio 
nl 


pre-B lymphocyte gene 1 [Mus musculus] 


HG1000323N0 160000 genejredictio 
nl 


lipoprotein lipase [Mus musculus] 


HGl 000323N0 1 60000_gene_predictio 
n2 


similar to procollagen, type V, alpha 2 [Mus 
musculus] 


HG1000327N0_l'000_gene_predictionl 


unnamed protein product [Mus musculus] 


HGl 000327N0_1 60000_gene_i)redictio 
nl 


unnamed protein product [Mus musculus] 


HG1000434N0 160000_gene_predictio 
nl 


uromodulin; Tamni-Horsfall glycoprotein [Mus 
musculus] 


HG1000449N0 160000_gene_predictio 
nl 


trefoil factor 1 [Mus musculus] 


HG1000807N0 •160000_gene_predictio 
nl 


IGFBP-like protein [Mus musculus] 


HG1000807N0_5000 _gene_predictionl 


gi|9055246|reflNP_06121 1.1| IGFBP-like 
protein [Mus musculus] 


HG1001280N0 160000 gene_predictio 
nl 


gi|26336763|dbj|BAC32064.1| unnamed protein 
product [Mus musculus] 


HG1000193N0 160000 _gene_predictio 
nl 


gi|21 59501 l|gb|AAH3 1409.1] RIKEN cDNA 
2410030007 gene [Mus musculus] 


HG1000286N0 160000 gene_predictio 
nl 


gi|303678|dbj|BAA02298.1| 47-kDaheat shock 
protein [Mus musculus] 


HG1000569N0 160000_^ene_predictio 
nl 


gi|20881983|reflXP_122793.1| similar to heat- 
stable antigen-related hypothetical protein 

HSA-C - mouse [Mus musculus] 


HG1000992N0 160000_gene_predictio 
nl 


gi|263 3 1 91 6|dbj |BAC29688.1 1 unnamed protein 
product [Mus musculus] 


HG 1 00 1 1 48N0_1 60000_gene_predictio 
nl 


gi|6752962|reflNP_033744.1| a disintegrin and 
metalloprotease domain 15 (metargidin); a 
disintegrin and metalloproteinase domain 


HG1001185N0 160000_gene__predictio 
n2 


gi|26329785|dbj|B AC28631.il unnamed protein 
product [Mus musculus] 


HG1001280N0_5000_gene_predictionl 


gi|26336763|dbj|BAC32064.1| unnamed protein 
product [Mus musculus] 


HG1001302N0 160000_gene_predictio 
n2 


gi|20136122|gb|AAMl 1539.11 matrilin-2 [Mus 
musculus] 


HGl 00036 INO 160000_gene_predictio 
nl 


gi|20867549|reflXP_125932.1| RIKEN cDNA 
9030421L11 [Mus musculus] 
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HG1000361N0 20000_gene_prediction 


gi|26330472|dbj|BAC28966.1| unnamed protein 
product [Mus musculus] 


HG1000792N0 160000 _gene_predictio 
nl 


gi|27229118|reflNP_082129.2| RIKEN cDNA 
)610006F02 [Mus musculus] 


HG1000934N0_1 60000 _gene_predictio 
nl 


903 0421 LI 1 [Mus musculus] 


HG1000976N0 160000_gene_predictio 
nl 


mil 1 Qr'i7Q(^^lrpfl>JP 071 87Q 1 1 rvtnrhrome 
P450, subfamily IVF, polypeptide 14 
(leukotriene B4 omega hydroxylase) [Mus 
musculus] 


HG1000992N0 10000_^ene_prediction 
1 


gi|2633 1 9 1 6|dbj|BAC29688. 1 1 unnamed protein 
product [Mus musculus] 


HGl GO 11 85N0_1 000_gene_predictionl 


gi|26329785|dbj|BAC28631.1| unnamed protein 
product [Mus musculus] 


HG1001185N0 160000_gene_predictio 
nl 


gi|26329785|dbj|BAC28631.1| unnamed protein 
product [Mus musculus] 


HGl 00 1 1 85N0_1 000_gene_prediction2 


gi|26329785|dbj|B AC28631.il unnamed protein 
product [Mus musculus] 


HGl 00 1 1 85N0_5000 _gene_predictionl 


gi|26329785|dbj|B AC28631.il unnamed protein 
product [Mus musculus] 


HG1001280N0 10000_gene_prediction 
1 


gi|26336763|dbj|BAC32064.1| unnamed protein 
product [Mus musculus] 


HG1000361N0 10000_gene_prediction 
1 


gi|26330472|dbj|BAC28966.1| unnamed protein 
product [Mus musculus] 


HG1001381N0_1000 _gene_predictionl 


gi|26343077|dbj|BAC35195.1| unnamed protein 
product [Mus musculus] 


HG1000263N0_5000 _gene_predictionl 


gi|26360198|dbj|BAB25612.2| unnamed protein 
product [Mus musculus] 


HGl 00 1 052N0_0_gene_predictionl 


gi|20072693|gb| AAH27297.il Similar to cyclin 
K [Mus musculus] 


HGl 000498N0_1 60000_gene_predictio 
nl 


gi|26352844|dbj|BAC40052.1| unnamed protein 
product [Mus musculus] 


HG1000579N0_160000_gene_predictio 
nl 


gi|26330550|dbj IBAC29005.il unnamed protein 
product [Mus musculus] 


HG1000685N0 160000_gene_predictio 
nl 


gi|6753236|reflNP_033915.1| calcium channel, 
voltage dependent, alpha2/delta subunit 3; 
alpha 2 delta-3 [Mus musculus] 


HG1000191N0 160000_gene_predictio 
nl 


gi|13385832|reflNP_080608.1| RIKEN cDNA 
1810055D05 [Mus musculus] 


HGl 000296N0_1 60000_gene_predictio 
n2 


gi|25054735|reflXP_192839.1| ATPas, class II, 
type 9B [Mus musculus] 


HGl 000346N0_1 000_gene_predictionl 


gi|26330504|dbj|BAC28982.1| unnamed protein 
product [Mus musculus] 


HGl 000963N0_5000_^ene_predictionl 


gi|12963665|reflNP 075892.11 mesoderm 
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development candiate 2; RIKEN cDNA 
2210015011 gene [Mus musculus] 


HGl 0006 1 0N0_1 60000_sene_j)redictio 
nl 


gi|26335037|dbj|BAC31219.1| unnamed protein 
product [Mus musculus] 


HG1000342N0_160000_gene_prediction1


gi|20881983|ref|XP_122793.1| similar to heat-
stable antigen-related hypothetical protein
HSA-C - mouse [Mus musculus]


HG1000342N0_160000_gene_prediction2


gi|20881983|ref|XP_122793.1| similar to heat-
stable antigen-related hypothetical protein
HSA-C - mouse [Mus musculus]


HG1000650N0_50000_gene_prediction1


gi|20270210|ref|NP_083847.1| RIKEN cDNA
1110001A12 [Mus musculus]


HG1000191N0_160000_gene_prediction2


gi|13385832|ref|NP_080608.1| RIKEN cDNA
1810055D05 [Mus musculus]


HG1000449N0_160000_gene_prediction3


gi|6755773|ref|NP_035705.1| trefoil factor 3,
intestinal [Mus musculus]


HG1000181N0_20000_gene_prediction1


gi|26334755|dbj|BAC31078.1| unnamed protein
product [Mus musculus]


HG1001058N0_160000_gene_prediction1


gi|20344262|ref|XP_110959.1| similar to
LD31582p [Drosophila melanogaster] [Mus
musculus]


HG1000187N0_160000_gene_prediction2


gi|26346705|dbj|BAC37001.1| unnamed protein
product [Mus musculus]


HG1000191N0_1000_gene_prediction1


gi|13385832|ref|NP_080608.1| RIKEN cDNA
1810055D05 [Mus musculus]


HG1000319N0_160000_gene_prediction1


gi|25021456|ref|XP_207950.1| similar to
pORF2 [Mus musculus domesticus]


HG1000137N0_0_gene_prediction1


gi|20843789|ref|XP_133814.1| similar to
hypothetical protein IMAGE3455200 [Homo
sapiens] [Mus musculus]


HG1000191N0_5000_gene_prediction1


gi|12842346|dbj|BAB25565.1| unnamed protein
product [Mus musculus]


HG1000622N0_160000_gene_prediction1


gi|25022040|ref|XP_204233.1| similar to ORF2
[Mus musculus domesticus]


HG1000390N0_1000_gene_prediction1


gi|20892585|ref|XP_147977.1| RIKEN cDNA
2610001E17 [Mus musculus]


HG1001350N0_5000_gene_prediction1


gi|13386102|ref|NP_080892.1| RIKEN cDNA
1500026D16 [Mus musculus]


HG1000327N0_160000_gene_prediction2


gi|26324414|dbj|BAC25961.1| unnamed protein
product [Mus musculus]


HG1000179N0_160000_gene_prediction1


gi|20862121|ref|XP_146270.1| similar to
putative alpha 1,3-fucosyl transferase [Mus
musculus]


HG1000806N0_20000_gene_prediction1


gi|23592855|ref|XP_129487.2| hypothetical
protein MGC40674 [Mus musculus]


HG1000991N0_160000_gene_prediction1


gi|6755338|ref|NP_036013.1| ring finger
protein 13 [Mus musculus]


HG1001489N0_20000_gene_prediction


gi|23592855|ref|XP_129487.2| hypothetical
protein MGC40674 [Mus musculus]


HG1001038N0_5000_gene_prediction1


gi|20892051|ref|XP_148657.1| similar to
Lethal(2)neighbour of tid protein 2 (NOT53)
[Mus musculus]


HG1001376N0_160000_gene_prediction2


gi|27261816|ref|NP_080861.1| RIKEN cDNA
C530005J20 [Mus musculus]


HG1001376N0_20000_gene_prediction2


gi|27261816|ref|NP_080861.1| RIKEN cDNA
C530005J20 [Mus musculus]


HG1001478N0_10000_gene_prediction1


gi|6979907|gb|AAF34647.1|AF221103_1
kinesin-related protein KIFC5B [Mus
musculus]


HG1000806N0_160000_gene_prediction1


gi|23592855|ref|XP_129487.2| hypothetical
protein MGC40674 [Mus musculus]


HG1000409N0_160000_gene_prediction1


gi|3599320|gb|AAC72793.1| ORF2 [Mus
musculus domesticus]


HG1000884N0_160000_gene_prediction1


gi|26329055|dbj|BAC28266.1| unnamed protein
product [Mus musculus]


HG1000575N0_160000_gene_prediction1


gi|20889984|ref|XP_129281.1| RIKEN cDNA
4930538D17 [Mus musculus]


HG1000403N0_160000_gene_prediction1


gi|26340168|dbj|BAC33747.1| unnamed protein
product [Mus musculus]


HG1000906N0_10000_gene_prediction


gi|20836822|ref|XP_130277.1| similar to
Plakophilin 4 (p0071) [Mus musculus]


HG1001201N0_160000_gene_prediction1


gi|26341746|dbj|BAC34535.1| unnamed protein
product [Mus musculus]


HG1000485N0_160000_gene_prediction1


gi|23597904|ref|XP_129263.2| protein
phosphatase 1, regulatory (inhibitor) subunit 3C
[Mus musculus]


HG1000328N0_160000_gene_prediction1


gi|26336731|dbj|BAC32048.1| unnamed protein
product [Mus musculus]


HG1000231N0_160000_gene_prediction1


gi|26341312|dbj|BAC34318.1| unnamed protein
product [Mus musculus]


HG1001257N0_10000_gene_prediction1


gi|26346593|dbj|BAC36945.1| unnamed protein
product [Mus musculus]


HG1000026N0_5000_gene_prediction1


gi|9506367|ref|NP_062425.1| ATP-binding
cassette, sub-family B, member 10; ATP-
binding cassette, sub-family B (MDR/TAP),
member 12; Abc-mitochondrial erythroid [Mus
musculus]


HG1000300N0_160000_gene_prediction1


gi|12846244|dbj|BAB27089.1| unnamed protein
product [Mus musculus]


HG1000109N0_160000_gene_prediction1


gi|22779909|ref|NP_690028.1| RIKEN cDNA
2700083B01 [Mus musculus]


HG1000617N0_20000_gene_prediction1


gi|7949115|ref|NP_058079.1| Ser/Arg-related
nuclear matrix protein; plenty-of-prolines-101;
serine/arginine repetitive matrix protein 1 [Mus
musculus]


HG1001110N0_160000_gene_prediction1


gi|22779909|ref|NP_690028.1| RIKEN cDNA
2700083B01 [Mus musculus]


HG1001334N0_160000_gene_prediction1


gi|26332062|dbj|BAC29761.1| unnamed protein
product [Mus musculus]


HG1001376N0_160000_gene_prediction


gi|27261816|ref|NP_080861.1| RIKEN cDNA
C530005J20 [Mus musculus]


HG1000026N0_20000_gene_prediction1


gi|9506367|ref|NP_062425.1| ATP-binding
cassette, sub-family B, member 10; ATP-
binding cassette, sub-family B (MDR/TAP),
member 12; Abc-mitochondrial erythroid [Mus
musculus]


HG1000276N0_1000_gene_prediction1


gi|19527228|ref|NP_598768.1| DNA segment,
Chr 10, ERATO Doi 214, expressed [Mus
musculus]


HG1000822N0_160000_gene_prediction2


gi|6680195|ref|NP_032255.1| histone
deacetylase 2; DNA segment, Chr 10, Wayne
State University 179, expressed [Mus
musculus]


HG1000173N0_20000_gene_prediction1


gi|26345110|dbj|BAC36204.1| unnamed protein
product [Mus musculus]


HG1000834N0_160000_gene_prediction1


gi|3599320|gb|AAC72793.1| ORF2 [Mus
musculus domesticus]


HG1001044N0_1000_gene_prediction1


gi|26330836|dbj|BAC29148.1| unnamed protein
product [Mus musculus]


HG1000299N0_1000_gene_prediction1


gi|6753882|ref|NP_034349.1| FK506 binding
protein 4 (59 kDa) [Mus musculus]


HG1000752N0_10000_gene_prediction1


gi|25955698|gb|AAH40387.1| Similar to
PTPL1-associated RhoGAP 1 [Mus musculus]


HG1000839N0_160000_gene_prediction2


gi|17512422|gb|AAH19171.1| Similar to
RIKEN cDNA 2310010G13 gene [Mus
musculus]


HG1000659N0_160000_gene_prediction1


gi|26333733|dbj|BAC30584.1| unnamed protein
product [Mus musculus]


HG1000659N0_160000_gene_prediction2


gi|26333733|dbj|BAC30584.1| unnamed protein
product [Mus musculus]


HG1000013N0_160000_gene_prediction1


gi|20881136|ref|XP_126284.1| similar to sperm
antigen HCMOGT-1 [Homo sapiens] [Mus
musculus]


HG1000173N0_160000_gene_prediction1


gi|26345110|dbj|BAC36204.1| unnamed protein
product [Mus musculus]


HG1000330N0_160000_gene_prediction1


/ 40Zo JZ|gD|AAU 1 JOUD . 1 1 Ar 40Z Ji Jl 

modulator of estrogen induced transcription 
[Mus musculus]


HG1000360N0_20000_gene_prediction1


gi|7861746|gb|AAF70384.1|AF189263_1
GABA-A receptor epsilon-like subunit [Mus
musculus]


HG1000178N0_10000_gene_prediction


gi|13384830|ref|NP_079706.1| RIKEN cDNA
1110066C01 [Mus musculus]


HG1000178N0_10000_gene_prediction2


gi|13384830|ref|NP_079706.1| RIKEN cDNA
1110066C01 [Mus musculus]


HG1000360N0_20000_gene_prediction2


gi|7861746|gb|AAF70384.1|AF189263_1
GABA-A receptor epsilon-like subunit [Mus
musculus]


HG1000640N0_160000_gene_prediction1


gi|21313034|ref|NP_080346.1| RIKEN cDNA
2900091E11 [Mus musculus]


HG1001000N0_160000_gene_prediction1


gi|10181212|ref|NP_065613.1| RIKEN cDNA
1300007B12; clone MNCb-2755 [Mus
musculus]


HG1001418N0_160000_gene_prediction1


gi|20819462|ref|XP_158058.1| hypothetical
protein XP_158058 [Mus musculus]


HG1000153N0_20000_gene_prediction1


gi|26379523|dbj|BAB29070.2| unnamed protein
product [Mus musculus]


HG1000255N0_160000_gene_prediction1


gi|13385532|ref|NP_080303.1| RIKEN cDNA
2700086I23 [Mus musculus]


HG1000186N0_160000_gene_prediction1


gi|20963196|ref|XP_135684.1| RIKEN cDNA
1700022L20 [Mus musculus]


HG1000259N0_160000_gene_prediction1


gi|26360198|dbj|BAB25612.2| unnamed protein
product [Mus musculus]


HG1000559N0_10000_gene_prediction1




HG1000084N0_10000_gene_prediction


gi|6678794|ref|NP_032953.1| mitogen activated
protein kinase kinase 1; MAP kinase kinase 1;
protein kinase, mitogen activated, kinase 1, p45
[Mus musculus]


HG1000217N0_160000_gene_prediction1


gi|6681015|ref|NP_031789.1| cysteine rich
intestinal protein [Mus musculus]


HG1000217N0_160000_gene_prediction2


gi|6681015|ref|NP_031789.1| cysteine rich
intestinal protein [Mus musculus]


HG1000329N0_160000_gene_prediction1


gi|26330870|dbj|BAC29165.1| unnamed protein
product [Mus musculus]


HG1000570N0_160000_gene_prediction1


gi|6716522|gb|AAF26675.1|AF155821_1
CPG16 [Mus musculus]



WO 2005/005597



PCT/US2003/027106



FPID


Fantom Top Hit Annotation


HG1000617N0_40000_gene_prediction1


gi|3599320|gb|AAC72793.1| ORF2 [Mus
musculus domesticus]


HG1000227N0_160000_gene_prediction1


gl|/13D/4Uz|Sp|l^yCZ,l5U|UjOU_MUUoc, 

Succinate dehydrogenase cytochrome b560
subunit, mitochondrial precursor (Integral
membrane protein CII-3) (QPS1) (QPs-1)


HG100026?N0_10000_gene_prediction1


gi|7706341|ref|NP_057145.1| yippee protein
[Homo sapiens]


HG1000615N0_160000_gene_prediction2


gi|4506725|ref|NP_000998.1| ribosomal protein
S4, X-linked X isoform; 40S ribosomal protein
S4, X isoform; ribosomal protein S4X isoform;
single-copy abundant mRNA; cell cycle gene 2
[Homo sapiens]


HG1000617N0_160000_gene_prediction1


gi|3599320|gb|AAC72793.1| ORF2 [Mus
musculus domesticus]


HG1000621N0_160000_gene_prediction2


gi|4506725|ref|NP_000998.1| ribosomal protein
S4, X-linked X isoform; 40S ribosomal protein
S4, X isoform; ribosomal protein S4X isoform;
single-copy abundant mRNA; cell cycle gene 2
[Homo sapiens]


HG1000990N0_160000_gene_prediction1


gi|10946760|ref|NP_067381.1| triggering
receptor expressed on myeloid cells 1;
triggering receptor expressed in monocytes 1
[Mus musculus]


HG1000998N0_160000_gene_prediction1


gi|6678483|ref|NP_033483.1| ubiquitin-
activating enzyme E1, Chr X [Mus musculus]


HG1001225N0_160000_gene_prediction1


gi|10181192|ref|NP_065589.1| sulfotransferase-
related protein SULT-X1 [Mus musculus]


HG1001269N0_5000_gene_prediction1


gi|21311883|ref|NP_080887.1| RIKEN cDNA
0610007O07 [Mus musculus]


HG1001269N0_160000_gene_prediction1


gi|21311883|ref|NP_080887.1| RIKEN cDNA
0610007O07 [Mus musculus]


HG1000103N0_160000_gene_prediction1


gi|26327721|dbj|BAC27604.1| unnamed protein
product [Mus musculus]


HG1000143N0_1000_gene_prediction1


gi|14141193|ref|NP_001004.2| ribosomal
protein S9; 40S ribosomal protein S9 [Homo
sapiens]


HG1000396N0_160000_gene_prediction1


gi|25024769|ref|XP_207136.1| similar to ORF2
[Mus musculus domesticus]


HG1001502N0_160000_gene_prediction2


gi|2144100|pir||I64837 Set beta isoform - rat


HG1000066N0_160000_gene_prediction1


gi|26337951|dbj|BAC32661.1| unnamed protein
product [Mus musculus]


HG1000078N0_1000_gene_prediction1


gi|26346587|dbj|BAC36942.1| unnamed protein
product [Mus musculus]


HG1000117N0_160000_gene_prediction1


gi|20875580|ref|XP_131162.1| sorting nexin 7
[Mus musculus]


HG1000157N0_160000_gene_prediction1


gi|26344914|dbj|BAC36106.1| unnamed protein
product [Mus musculus]


HG1000194N0_160000_gene_prediction1


gi|21313022|ref|NP_083674.1| RIKEN cDNA
5730496E24 [Mus musculus]


HG1000501N0_160000_gene_prediction1


gi|27370478|ref|NP_766552.1| hypothetical
protein E130310N06 [Mus musculus]


1 


gi|12855078|dbj|BAB30210.1| unnamed protein
product [Mus musculus] 


xlKj 1 UWUO JOIN U 1 V/UUV/ ^CilC piCUll/UUii 

2 


gi|12855078|dbj|BAB30210.1| unnamed protein
product [Mus musculus]


HG1000750N0_160000_gene_prediction


gi|26336392|dbj|BAC31881.1| unnamed protein
product [Mus musculus]


HG1001012N0_160000_gene_prediction


gi|21312504|ref|NP_081554.1| RIKEN cDNA
2810432D09 [Mus musculus]


1 


gi|20882986|ref|XP_126218.1| similar to
Hermansky-Pudlak syndrome protein variant 
[Rattus norvegicus] [Mus musculus] 


HG1000228N0_40000_gene_prediction1


gi|26342390|dbj|BAC34857.1| unnamed protein
product [Mus musculus]


HG1000228N0_20000_gene_prediction1


gi|13507676|ref|NP_109647.1| pumilio 1
(Drosophila) [Mus musculus]


HG1000228N0_160000_gene_prediction1


gi|13507676|ref|NP_109647.1| pumilio 1
(Drosophila) [Mus musculus]


HG1000390N0_160000_gene_prediction1


gi|20892585|ref|XP_147977.1| RIKEN cDNA
2610001E17 [Mus musculus]


HG1000409N0_10000_gene_prediction1


gi|26006245|dbj|BAC41465.1| mKIAA1047
protein [Mus musculus]


HG1000611N0_160000_gene_prediction1


gi|6650539|gb|AAF21895.1|AF103877_1
epsilon-sarcoglycan [Mus musculus]


HG1000847N0_10000_gene_prediction1




HG1000015N0_0_gene_prediction1


gi|20467423|ref|NP_620570.1| chondroitin
sulfate proteoglycan 4 [Mus musculus]


HG1000088N0_5000_gene_prediction1


gi|16741633|gb|AAH16619.1| pyruvate kinase
3 [Mus musculus]


HG1000143N0_10000_gene_prediction1


gi|20896345|ref|XP_128324.1| carbonyl
reductase 3 [Mus musculus]


HG1000167N0_5000_gene_prediction1


gi|12848663|dbj|BAB28043.1| unnamed protein
product [Mus musculus]


HG1000243N0_5000_gene_prediction1


gi|8393534|ref|NP_058653.1| high mobility
group protein 17 [Mus musculus]


HG1000825N0_160000_gene_prediction1


gi|21311983|ref|NP_080956.1| RIKEN cDNA
0610012C01 [Mus musculus]


HG1001019N0_1000_gene_prediction1


gi|26343769|dbj|BAC35541.1| unnamed protein
product [Mus musculus]


HG1000044N0_160000_gene_prediction1


gi|15079309|gb|AAH11494.1| Similar to
Myosin of the dilute-myosin-V family [Mus
musculus]


HG1000100N0_10000_gene_prediction1


gi|4506127|ref|NP_002755.1| phosphoribosyl
pyrophosphate synthetase 1 [Homo sapiens]


HG1000149N0_160000_gene_prediction1


gi|12834813|dbj|BAB23054.1| unnamed protein
product [Mus musculus]


HG1000183N0_1000_gene_prediction1


gi|27370150|ref|NP_766364.1| hypothetical
protein D630002G06 [Mus musculus]


HG1000183N0_160000_gene_prediction2


gi|27370150|ref|NP_766364.1| hypothetical
protein D630002G06 [Mus musculus]


HG1000213N0_5000_gene_prediction1


gi|6753178|ref|NP_035923.1| breakpoint cluster
region protein 1; barrier to autointegration
factor [Mus musculus]


HG1000294N0_5000_gene_prediction1


gi|18390327|ref|NP_083908.1| protein
phosphatase 1, regulatory (inhibitor) subunit
11; t-complex testis-expressed 5 [Mus
musculus]


HG1000331N0_160000_gene_prediction1


rri'lonSAOROdlrf^flYP 1410^1 1 1 «;iinilflr tn ??lit 
homolog 1 (Drosophila); slit (Drosophila) 
homolog 1; slit1 [Homo sapiens] [Mus
musculus] 


HG1000391N0_160000_gene_prediction2


gi|20887543|ref|XP_134475.1| RIKEN cDNA
2310022B05 [Mus musculus]


HG1000430N0_160000_gene_prediction1


gi|26382861|dbj|BAC25510.1| unnamed protein
product [Mus musculus]


HG1000597N0_160000_gene_prediction1


gi|26325886|dbj|BAC26697.1| unnamed protein
product [Mus musculus]


HG1000078N0_5000_gene_prediction1


gi|26346587|dbj|BAC36942.1| unnamed protein
product [Mus musculus]


HG1000139N0_5000_gene_prediction1


gi|23597632|ref|XP_127052.2| similar to
hypothetical protein FLJ13920 [Homo sapiens]
[Mus musculus]


HG1000143N0_160000_gene_prediction1


gi|20896345|ref|XP_128324.1| carbonyl
reductase 3 [Mus musculus]


HG1000162N0_160000_gene_prediction1


gi|20835770|ref|XP_132127.1| similar to 60S
RIBOSOMAL PROTEIN L13 [Mus musculus]


HG1000168N0_160000_gene_prediction1


gi|12841593|dbj|BAB25272.1| unnamed protein
product [Mus musculus]


HG1000187N0_160000_gene_prediction1


gi|3599320|gb|AAC72793.1| ORF2 [Mus
musculus domesticus]


HG1000247N0_160000_gene_prediction1


gi|7656920|ref|NP_056547.1| axin2 [Mus
musculus]


HG1000273N0_160000_gene_prediction2


gi|25030042|ref|XP_207307.1| similar to
Retrovirus-related POL polyprotein [Mus
musculus]


HG1000415N0_10000_gene_prediction


rri\Q1AnQA(\\o-rryU\n A^TiQT^TX 11 h\mr>thpiira\ 

ppjO / o4U|eniDH^AiJi' / jz J. 1 1 uypuuicm^ai 
protein, weakly similar to (AF102871) neuronal 
apoptosis inhibitory protein 2 [Mus musculus] 
[Homo sapiens] 


HG1000539N0_160000_gene_prediction1


gi|7521942|pir||T29096 gag polyprotein -
murine endogenous retrovirus ERV-L


HG1000539N0_160000_gene_prediction2


gi|7521942|pir||T29096 gag polyprotein -
murine endogenous retrovirus ERV-L


HG1000560N0_160000_gene_prediction1


gi|12860683|dbj|BAB32021.1| unnamed protein
product [Mus musculus]


HG1000618N0_10000_gene_prediction1


gi|26350749|dbj|BAC39011.1| unnamed protein
product [Mus musculus]


HG1000740N0_160000_gene_prediction1


gi|23601536|ref|XP_130965.2| Nice-4 protein
homolog [Mus musculus]


HG1001197N0_160000_gene_prediction1


gi|26327779|dbj|BAC27630.1| unnamed protein
product [Mus musculus]


HG1000599N0_5000_gene_prediction1


gi|12836542|dbj|BAB23701.1| unnamed protein
product [Mus musculus]


HG1000020N0_5000_gene_prediction1


gi|20887101|ref|XP_129228.1| similar to
phosphoglucomutase 5 [Homo sapiens] [Mus
musculus]


HG1000084N0_5000_gene_prediction1


gi|6678794|ref|NP_032953.1| mitogen activated
protein kinase kinase 1; MAP kinase kinase 1;
protein kinase, mitogen activated, kinase 1, p45
[Mus musculus]


HG1000135N0_5000_gene_prediction1


gi|21312189|ref|NP_081197.1| RIKEN cDNA
1810010A06 [Mus musculus]


HG1000169N0_20000_gene_prediction1


gi|20886743|ref|XP_129211.1| phosphoserine
aminotransferase [Mus musculus]


HG1000169N0_160000_gene_prediction1


gi|20886743|ref|XP_129211.1| phosphoserine
aminotransferase [Mus musculus]


HG1000189N0_160000_gene_prediction1


gi|20879992|ref|XP_140210.1| similar to
BG:DS01759.1 gene product [Drosophila
melanogaster] [Mus musculus]


HG1000189N0_160000_gene_prediction2


gi|20879992|ref|XP_140210.1| similar to
BG:DS01759.1 gene product [Drosophila
melanogaster] [Mus musculus]


HG1000246N0_5000_gene_prediction1


GalNAc:polypeptide N- 
acetylgalactosaminyltransferase [Mus 
musculus] 


HG1000248N0_0_gene_prediction1


gi|9790219|ref|NP_062745.1| destrin; Sid23p
[Mus musculus]


HG1000288N0_10000_gene_prediction1


gi|20909512|ref|XP_153447.1| hypothetical
protein XP_153447 [Mus musculus]


HG1000424N0_5000_gene_prediction1


gi|25031822|ref|XP_207741.1| hypothetical
protein XP_207741 [Mus musculus]


HG1000443N0_40000_gene_prediction1


gi|26354072|dbj|BAC40666.1| unnamed protein 
product [Mus musculus] 


HG1000590N0_1000_gene_prediction1


gi|26378096|dbj|BAB28595.2| unnamed protein
product [Mus musculus]


HG1000626N0_160000_gene_prediction1


gi|yyjoUjU|rei|iNJr_uo*too / . 1 1 uypuiuou^oi 
protein, MNCb-4193; hypothetical protein 
MNCb-4193 [Mus musculus] 


HG1000871N0_160000_gene_prediction1


gi|6752958|ref|NP_033742.1| activin A
receptor, type II-like 1; activin receptor-like
kinase-1 [Mus musculus]


HG1000959N0_10000_gene_prediction1


gi|22507385|ref|NP_081019.1| RIKEN cDNA
1110014F12 [Mus musculus]


HG1000961N0_160000_gene_prediction3


gi|20822904|ref|XP_131914.1| RIKEN cDNA
3110004O18 [Mus musculus]


HG1000974N0_5000_gene_prediction1


gi|26378096|dbj|BAB28595.2| unnamed protein
product [Mus musculus]


HG1001045N0_160000_gene_prediction1


gi|25020138|ref|XP_207789.1| similar to
Retrovirus-related POL polyprotein [Mus
musculus]


HG1001110N0_0_gene_prediction1


gi|23956080|ref|NP_058675.1| putative
serine/threonine kinase [Mus musculus]


HG1001223N0_1000_gene_prediction1


gi|26339658|dbj|BAC33500.1| unnamed protein
product [Mus musculus]


HG1001281N0_160000_gene_prediction1


gi|15431279|ref|NP_203538.1| dedicator of
cyto-kinesis 2 [Mus musculus]


HG1001317N0_5000_gene_prediction1


gi|26327365|dbj|BAC27426.1| unnamed protein
product [Mus musculus]


HG1001485N0_5000_gene_prediction1


gi|26327365|dbj|BAC27426.1| unnamed protein
product [Mus musculus]


HG1000674N0_160000_gene_prediction1


gi|24211881|sp|Q8VCR8|KML2_MOUSE
Myosin light chain kinase 2, skeletal/cardiac
muscle (MLCK2)


HG1001017N0_10000_gene_prediction1


gi|25019831|ref|XP_207463.1| similar to
CD59B [Mus musculus]


HG1001017N0_1000_gene_prediction1


gi|25019831|ref|XP_207463.1| similar to
CD59B [Mus musculus]


HG1000014N0_160000_gene_prediction2


gi|6680744|ref|NP_031528.1| ATPase, Na+/K+
transporting, beta 3 polypeptide; ATPase,
Na+/K+ beta 3 polypeptide [Mus musculus]


HG1000043N0_160000_gene_prediction3


gi|26337385|dbj|BAC32378.1| unnamed protein
product [Mus musculus]


HG1000052N0_160000_gene_prediction1


gi|3599320|gb|AAC72793.1| ORF2 [Mus
musculus domesticus]


HG1000084N0_5000_gene_prediction2


gi|6678794|ref|NP_032953.1| mitogen activated
protein kinase kinase 1; MAP kinase kinase 1;
protein kinase, mitogen activated, kinase 1, p45
[Mus musculus]


HG1000093N0_1000_gene_prediction1


gi|26350865|dbj|BAC39069.1| unnamed protein
product [Mus musculus]


HG1000105N0_160000_gene_prediction1


gi|14198371|gb|AAH08247.1| Similar to cyclin
B2 [Mus musculus]


HG1000157N0_1000_gene_prediction1


gi|5803225|ref|NP_006752.1| tyrosine
3/tryptophan 5-monooxygenase activation
protein, epsilon polypeptide; 14-3-3 epsilon;
mitochondrial import stimulation factor L
subunit; protein kinase C inhibitor protein-1
[Homo sapiens]


HG1000210N0_40000_gene_prediction1


gi|17160840|gb|AAH17597.1| RIKEN cDNA
5830401B18 gene [Mus musculus]


HG1000242N0_5000_gene_prediction1


gi|9789937|ref|NP_062768.1| DnaJ (Hsp40)
homolog, subfamily A, member 2; DNA J
protein [Mus musculus]


HG1000243N0_5000_gene_prediction2


gi|8393534|ref|NP_058653.1| high mobility
group protein 17 [Mus musculus]


HG1000256N0_160000_gene_prediction1


gi|13959400|sp|Q9R0Y5|KAD1_MOUSE
Adenylate kinase isoenzyme 1 (ATP-AMP
transphosphorylase) (AK1) (Myokinase)


HG1000279N0_0_gene_prediction1


gi|15617203|ref|NP_254279.1| chloride
intracellular channel 1 [Mus musculus]


HG1000280N0_5000_gene_prediction1


gi|7106337|ref|NP_034796.1| keratin complex-
1, gene C29 [Mus musculus]


HG1000280N0_5000_gene_prediction2


gi|7106337|ref|NP_034796.1| keratin complex-
1, gene C29 [Mus musculus]


HG1000282N0_160000_gene_prediction1


gi|20902823|ref|XP_128021.1| similar to
Mitochondrial import receptor subunit TOM22
homolog (Translocase of outer membrane 22
kDa subunit homolog) (hTom22) (1C9-2) [Mus
musculus]


HG1000292N0_160000_gene_prediction1


gi|6981488|ref|NP_037356.1| ribosomal protein
S26 [Rattus norvegicus]


HG1000313N0_160000_gene_prediction1


gi|4506283|ref|NP_003454.1| protein tyrosine
phosphatase type IVA, member 1; Protein
tyrosine phosphatase IVA1 [Homo sapiens]


HG1000330N0_20000_gene_prediction1


gi|22122511|ref|NP_666146.1| hypothetical
protein MGC30562 [Mus musculus]


nl • ■ . 


gi|26350551|dbj|BAC38915.1| unnamed protein
product [Mus musculus] 


wni nnO'^d.msJn* 1 fiOnnO apne nredictio 

xliJl UUUJH'UiNV/ lUV/UUV/ _^CilC |JlPUH^u.u 

nl 


gi|20912842|ref|XP_126689.1| RIKEN cDNA
3300001P08 [Mus musculus]


HG1000344N0_160000_gene_prediction1


gi|21450239|ref|NP_659092.1| hypothetical
protein MGC27983 [Mus musculus]


1 


gi|25046794|ref|XP_207489.1| similar to RNP
particle component [Mus musculus] 


nnO'^SJJ.TSJn I^^OOOO crpnf» nredictio 

XTVJT 1 UUU JOH-INU 1 \J\J\/V\J ji^OllO piCUlVllVJ 

nl 


gi|20909520|ref|XP_126941.1| RIKEN cDNA
2600011C06 [Mus musculus]


Tjn 1 fv\f\A A QTvTn lAOOnn apne nredirtin 
rivjr 1 uuUt^toIN w 1 uuu vU goiic_^i cui^ 

nl 


gi|6678247|ref|NP_033358.1| transcription
factor 7-like 1 [Mus musculus]


nl 


gi|26334795|dbj|BAC31098.1| unnamed protein

product [Mus musculus] 


HG1000486N0_20000_gene_prediction1


gi|26350551|dbj|BAC38915.1| unnamed protein
product [Mus musculus]


HG1000506N0_160000_gene_prediction1


gi|20909520|ref|XP_126941.1| RIKEN cDNA
2600011C06 [Mus musculus]


HG1000518N0_160000_gene_prediction


gi|26351279|dbj|BAC39276.1| unnamed protein
product [Mus musculus]


HG1000550N0_160000_gene_prediction1


gi|20909520|ref|XP_126941.1| RIKEN cDNA
2600011C06 [Mus musculus]


HG1000556N0_160000_gene_prediction1


gi|2jUj I4y /|rei|Ar_zu / jj/.i 1 sumiar lo 
Retrovirus-related POL polyprotein [Mus
musculus] 


HG1000588N0_160000_gene_prediction1


gi|li/// /4 /|gC)|AArlUj /Oo.i| mterieron- 
induced protein with tetratricopeptide repeats 1 
[Mus musculus] 


HG1000600N0_160000_gene_prediction1


gi|2UoDJj /o|rei|Ar_l j4i4o.i| sunuar TO 
hypothetical protein [Macaca fascicularis] [Mus 
musculus] 


HG1000647N0_160000_gene_prediction1


gi|9506517|ref|NP_062338.1| cytotoxic and
regulatory T cell molecule; class I-restricted T
cell-associated molecule [Mus musculus]


HG1000648N0_160000_gene_prediction1


gi|20900199|ref|XP_128639.1| RIKEN cDNA
2810055C19 [Mus musculus]


HG1000688N0_160000_gene_prediction1


gi|26327707|dbj|BAC27597.1| unnamed protein
product [Mus musculus]


HG1000696N0_160000_gene_prediction1


gi|3599320|gb|AAC72793.1| ORF2 [Mus
musculus domesticus]


HG1000788N0_160000_gene_prediction1


p.|2Uo4/yi/|rei|Ar_l'WDlW.l 1 simiidr lu 
KIAA1904 protein [Homo sapiens] [Mus
musculus] 


HG1000874N0_160000_gene_prediction1


.^lOAI/IO 1 '7^^1i-,a-fl"V"D 1 1 n^On 11 cimilar 

5i|2Ui4zl /D|rei|Ar_i iU4yu.i| suniiar lo 
hypothetical protein MGC955 [Homo sapiens]
[Mus musculus]


HG1000902N0_20000_gene_prediction1


gi|6753324|ref|NP_033968.1| chaperonin
subunit 6a (zeta); chaperonin containing TCP-1
[Mus musculus]


HG1000902N0_160000_gene_prediction


gi|6753324|ref|NP_033968.1| chaperonin
subunit 6a (zeta); chaperonin containing TCP-1
[Mus musculus]


HG1000902N0_1000_gene_prediction1


gi|6753324|ref|NP_033968.1| chaperonin
subunit 6a (zeta); chaperonin containing TCP-1
[Mus musculus]


HG1000904N0_160000_gene_prediction1


gi|3599320|gb|AAC72793.1| ORF2 [Mus
musculus domesticus]


HG1000966N0_1000_gene_prediction1


gi|22122617|ref|NP_666215.1| hypothetical
protein MGC25511 [Mus musculus]


HG1000966N0_5000_gene_prediction1


gi|22122617|ref|NP_666215.1| hypothetical
protein MGC25511 [Mus musculus]


nl 


gi|12855175|dbj|BAB30238.1| unnamed protein 
product [Mus musculus] 


n3 


gi|26337385|dbj|BAC32378.1| unnamed protein 
product [Mus musculus] 


HG1001041N0_5000_gene_prediction1


gi|25071304|ref|XP_146497.3| similar to
protein serine kinase Pskh1 [Mus musculus]


HG1001337N0_160000_gene_prediction1


gi|27369704|ref|NP_766096.1| hypothetical
protein 6030499O08 [Mus musculus]


HG1001417N0_5000_gene_prediction1


gi|26349767|dbj|BAC38523.1| unnamed protein
product [Mus musculus]


HG1001485N0_160000_gene_prediction1


gi|7513636|pir||T30805 dutt1 protein - mouse


HG1000151N0_160000_gene_prediction1


gi|18044328|gb|AAH19573.1| Unknown
(protein for IMAGE:3990036) [Mus musculus]


HG1000330N0_160000_gene_prediction3


gi|25029811|ref|XP_207217.1| similar to ORF2
[Mus musculus domesticus]


HG1000957N0_20000_gene_prediction1


gi|25024769|ref|XP_207136.1| similar to ORF2
[Mus musculus domesticus]


HG1000960N0_0_gene_prediction1


gi|20908689|ref|XP_127449.1| RIKEN cDNA
4632401C08 [Mus musculus]


HG1000960N0_0_gene_prediction2


gi|20908689|ref|XP_127449.1| RIKEN cDNA
4632401C08 [Mus musculus]


HG1001280N0_20000_gene_prediction1


gi|26336763|dbj|BAC32064.1| unnamed protein
product [Mus musculus]


HG1001502N0_160000_gene_prediction1


gi|27370240|ref|NP_766415.1| hypothetical
protein 4732490P18 [Mus musculus]


HG1000003N0_10000_gene_prediction1


gi|13624305|ref|NP_112440.1| procollagen,
type II, alpha 1 [Mus musculus]


HG1000041N0_160000_gene_prediction1


gi|26390169|dbj|BAC25854.1| unnamed protein
product [Mus musculus]


HG1000043N0_160000_gene_prediction2


gi|26337385|dbj|BAC32378.1| unnamed protein
product [Mus musculus]


HG1000044N0_5000_gene_prediction1


gi|15079309|gb|AAH11494.1| Similar to
Myosin of the dilute-myosin-V family [Mus
musculus]


HG1000051N0_160000_gene_prediction1


gi|14250190|gb|AAH08515.1| interferon
regulatory factor 6 [Mus musculus]


HG1000057N0_160000_gene_prediction1


gi|6755040|ref|NP_035202.1| profilin 1; actin
binding protein [Mus musculus]


HG1000060N0_160000_gene_prediction1


gi|6755901|ref|NP_035783.1| tubulin, alpha 1;
tubulin alpha 1 [Mus musculus]


HG1000061N0_10000_gene_prediction1


gi|20827552|ref|XP_130234.1| expressed
sequence AW610751 [Mus musculus]


HG1000079N0_160000_gene_prediction1


gi|20887309|ref|XP_129200.1| adenylate kinase
3 alpha like [Mus musculus]


HG1000098N0_160000_gene_prediction1


gi|26340666|dbj|BAC33995.1| unnamed protein
product [Mus musculus]


HG1000105N0_5000_gene_prediction1


gi|12850600|dbj|BAB28785.1| unnamed protein
product [Mus musculus]


HG1000121N0_160000_gene_prediction1


gi|26346402|dbj|BAC36852.1| unnamed protein
product [Mus musculus]


HG1000131N0_160000_gene_prediction1


gi|26329183|dbj|BAC28330.1| unnamed protein
product [Mus musculus]


HG1000134N0_160000_gene_prediction1


gi|12860377|dbj|BAB31934.1| unnamed protein
product [Mus musculus]


HG1000134N0_160000_gene_prediction2


gi|12860377|dbj|BAB31934.1| unnamed protein
product [Mus musculus]


HG1000136N0_160000_gene_prediction1


gi|26389519|dbj|BAC25745.1| unnamed protein
product [Mus musculus]


HG1000147N0_160000_gene_prediction1


gi|3717978|emb|CAA73041.1| 5S ribosomal
protein [Mus musculus]


HG1000166N0_160000_gene_prediction1


gi|20908717|ref|XP_127445.1| similar to
flavoprotein subunit of succinate-ubiquinone
reductase [Rattus norvegicus] [Mus musculus]


HG1000172N0_1000_gene_prediction1


gi|6681095|ref|NP_031834.1| cytochrome c,
somatic [Mus musculus]


HG1000172N0_1000_gene_prediction2


gi|6681095|ref|NP_031834.1| cytochrome c,
somatic [Mus musculus]


HG1000175N0_5000_gene_prediction1


gi|26354216|dbj|BAC40736.1| unnamed protein
product [Mus musculus]


HG1000175N0_10000_gene_prediction1


gi|26354216|dbj|BAC40736.1| unnamed protein
product [Mus musculus]


HG1000175N0_160000_gene_prediction1


gi|26354216|dbj|BAC40736.1| unnamed protein
product [Mus musculus]


HG1000175N0_1000_gene_prediction1


gi|26354216|dbj|BAC40736.1| unnamed protein
product [Mus musculus]


HG1000192N0_160000_gene_prediction1


gi|10946614|ref|NP_067287.1| WD repeat
domain 12; nuclear protein Ytm1 [Mus
musculus]


HG1000193N0 160000_gene_predictio 
n2 


gi|21728370|ref|NP_080178.1| RIKEN cDNA 
1500009M05 [Mus musculus] 


HG1000195N0 160000_gene_predictio 
nl 


gi|17390530|gb|AAH18231.1| Unknown 
(protein for MGC; 19236) [Mus musculus] 


HG1000197N0 160000 gene_piedictio 
nl 


gi|214501 85|ref|NP_659063.1| hypothetical 
protein MGC28186 [Mus musciius] 


HG1000202N0 20000 gene_prediction 
1 


gi|263 3 1946|dbj|B AC29703.il unnamed protein 
product [Mus musculus] 


HGl 0002 lONO 20000 gene_prediction 
1 


gi|17160840|gb|AAH17597.1| RIKEN cDNA 
5830401B18 gene [Mus musculus] 


HGl 0002 1 8N0_1 000_gene_predictlonl 


gi|6681015|reflNP_031789.1| cysteine rich 
intestinal protein [Mus musculus] 


HGl 0002 18N0 160000 gene_predictio 
nl 


gi|6681015|reflNP_031789.1| cysteine rich 
intestinal protein [Mus musculus] 


HG1000218N0 10000_gene_prediction 
1 


gi|6681 01 5|reflNP_03 1789.1] cysteine rich 
intestinal protein [Mus musculus] 


HGl 000222N0_1 000_gene_predictionl 


gi|13385054|reflNP_079873.1| RIKEN cDNA 
2700033116 [Mus musculus] 


HG1000233N0_1000_gene_predictionl 


gi|l 2847362|dbj|BAB27541 . 1 1 unnamed protein 
product [Mus musculus] 


HG1000234N0 1000_gene_predicticMil 


gi| 1 2847362|dbj|BAB27541 . 1 1 unnamed protein 
product [Mus musculus] 


HG1000234N0_160000_gene_prediction1


product [Mus musculus]


HG1000238N0_160000_gene_prediction2


gi|6671549|ref|NP_031479.1| anti-oxidant
protein 2; acidic calcium-independent
phospholipase A2; peroxiredoxin 5; 1-Cys Prx
[Mus musculus]


HG1000240N0_160000_gene_prediction1


gi|26328673|dbj|BAC28075.1| unnamed protein
product [Mus musculus]


HG1000245N0_160000_gene_prediction1


gi|12850132|dbj|BAB28604.1| unnamed protein
product [Mus musculus]


HG1000245N0_5000_gene_prediction1


gi|12850132|dbj|BAB28604.1| unnamed protein
product [Mus musculus]


HG1000249N0_10000_gene_prediction1


gi|6754654|ref|NP_034905.1| mannose binding
lectin, liver (A) [Mus musculus]


nl 


gi|20881913|ref|XP_126211.1| Dullard
homolog [Mus musculus]


HG1000252N0_5000_gene_prediction1


gi|20825536|ref|XP_129507.1| ring finger
protein 2 [Mus musculus]


HG1000254N0_160000_gene_prediction1


gi|13385058|ref|NP_079878.1| hypothetical
protein D10Ertd718e [Mus musculus]


nl 


gi|21312163|ref|NP_082683.1| RIKEN cDNA
2900054P12 [Mus musculus]


HG1000264N0_1000_gene_prediction1


gi|21624617|ref|NP_081018.1| RIKEN cDNA
1110007M04 [Mus musculus]


HG1000264N0_1000_gene_prediction2


gi|21624617|ref|NP_081018.1| RIKEN cDNA
1110007M04 [Mus musculus]


XlVJ 1 UUUZ / UIN U ZUUWU ^CTIC pi CUl^LiUll 

1 


gi|12844196|dbj|BAB26273.1| unnamed protein
product [Mus musculus]


HG1000270N0_1000_gene_prediction1


gi|12852884|dbj|BAB29566.1| unnamed protein
product [Mus musculus]


HG1000274N0_160000_gene_prediction1


gi|26347831|dbj|BAC37564.1| unnamed protein
product [Mus musculus]


HG1000276N0_160000_gene_predictio


gi|19527228|ref|NP_598768.1| DNA segment,
Chr 10, ERATO Doi 214, expressed [Mus
musculus]


HG1000276N0_5000_gene_prediction1


gi|19527228|ref|NP_598768.1| DNA segment,
Chr 10, ERATO Doi 214, expressed [Mus
musculus]


HG1000278N0_5000_gene_prediction1


gi|19527026|ref|NP_598568.1| expressed
sequence AA959742 [Mus musculus]


nl 


gi|7106337|ref|NP_034796.1| keratin complex-1,
gene C29 [Mus musculus]


HG1000280N0_1000_gene_prediction1


gi|7106337|ref|NP_034796.1| keratin complex-1,
gene C29 [Mus musculus]


HG1000280N0_160000_gene_prediction2


gi|7106337|ref|NP_034796.1| keratin complex-1,
gene C29 [Mus musculus]


HG1000280N0_1000_gene_prediction2


gi|7106337|ref|NP_034796.1| keratin complex-1,
gene C29 [Mus musculus]


HG1000305N0_5000_gene_prediction1


gi|27369902|ref|NP_766218.1| hypothetical
protein A530095G11 [Mus musculus]


HG1000305N0_5000_gene_prediction2


gi|27369902|ref|NP_766218.1| hypothetical
protein A530095G11 [Mus musculus]


HG1000307N0_160000_gene_prediction1


rvJIomiQ^llr-aflXTP M'JSAlid 11 nnHiY rnnrlpnsiirlp 

11 ojyooj J rei|rNx uj ooih-.i| xiuuia ^^ij.uL'icu&iuc 
diphosphate linked moiety X)-type motif 5 
[Mus musculus]


HG1000334N0_160000_gene_prediction1


gi|20888553|ref|XP_134832.1| similar to
Probable serine/threonine protein kinase
SNF1LK [Mus musculus]


HG1000335N0_160000_gene_prediction1


gi|20888553|ref|XP_134832.1| similar to
Probable serine/threonine protein kinase
SNF1LK [Mus musculus]


HG1000337N0_5000_gene_prediction1


gi|12851918|dbj|BAB29207.1| unnamed protein
product [Mus musculus]


HG1000343N0_160000_gene_prediction1


gi|3599320|gb|AAC72793.1| ORF2 [Mus
musculus domesticus]


HG1000343N0_160000_gene_prediction2


gi|13386340|ref|NP_083008.1| RIKEN cDNA
4632428N05 [Mus musculus]


HG1000369N0_160000_gene_prediction1


gi|12837873|dbj|BAB23982.1| unnamed protein
product [Mus musculus]


HG1000372N0_160000_gene_prediction1


gi|20913947|ref|XP_126555.1| RIKEN cDNA
1190006K01 [Mus musculus]


HG1000378N0_160000_gene_prediction1


gi|26348995|dbj|BAC38137.1| unnamed protein
product [Mus musculus]


HG1000387N0_160000_gene_prediction1


gi|3599320|gb|AAC72793.1| ORF2 [Mus
musculus domesticus]


HG1000387N0_160000_gene_prediction2


gi|26382861|dbj|BAC25510.1| unnamed protein
product [Mus musculus]


HG1000397N0_5000_gene_prediction1


gi|20836469|ref|XP_129717.1| hypothetical
protein XP_129717 [Mus musculus]


HG1000408N0_160000_gene_prediction1




HG1000414N0_160000_gene_prediction1


gi|3599320|gb|AAC72793.1| ORF2 [Mus
musculus domesticus]


HG1000431N0_160000_gene_prediction1


gi|8394057|ref|NP_058565.1| low density
lipoprotein receptor-related protein 4; low
density lipoprotein-related protein 4; Low
Density Lipoprotein Receptor Related Protein
4; corin [Mus musculus]


HG1000439N0_160000_gene_prediction1


gi|12851918|dbj|BAB29207.1| unnamed protein
product [Mus musculus]


HG1000449N0_20000_gene_prediction


gi|25025117|ref|XP_207206.1| similar to
transcription factor-like nuclear regulator;
putative transcription regulation nuclear
protein; putative transcription factor-like
nuclear regulator; TATA box binding protein
(TBP)-associated factor, RNA polymerase III,
GTF3B subunit 1; ... [Mus musculus]


HG1000457N0_160000_gene_prediction1


gi|20824761|ref|XP_133346.1| liver-specific
bHLH-Zip transcription factor [Mus musculus]


HG1000458N0_160000_gene_prediction1


gi|12841242|dbj|BAB25129.1| unnamed protein
product [Mus musculus]


HG1000461N0_160000_gene_prediction1


gi|25032310|ref|XP_205729.1| hypothetical
protein XP_205729 [Mus musculus]


VSr^'i (\(\f\d^'X\\SC\' 'i f^000() apne nredictio 

Xl Vj 1 V/vUt-O J iSV 1 \jW\J\J ^PJ-XP \Jl CU-lk/inj 

nl 


gi|12861068|dbj|BAB32114.1| unnamed protein 
product [Mus musculus] 


HG1000463N0_160000_gene_prediction2


gi|13249351|ref|NP_076402.1| inositol-requiring
1 alpha (yeast) [Mus musculus]


Mril nnnA7^%>Jn l^^nnnO apr»p nredictio 
rlVjf 1 UV/V/T' / DiN U 1 OV/V/V/ V/_gvJ,iC_J[Jl CUi^/ iLKj 

nl 


gi|26332657|dbj|BAC30046.1| unnamed protein
product [Mus musculus]


HG1000481N0_160000_gene_predictio


gi|21311873|ref|NP_077181.1| RIKEN cDNA
0610007A03 [Mus musculus]


HG1000530N0_160000_gene_prediction1


gi|20860491|ref|XP_153755.1| hypothetical
protein XP_153755 [Mus musculus]


n2


gi|25031497|ref|XP_207552.1| similar to
Retrovirus-related POL polyprotein [Mus
musculus]


HG1000584N0_160000_gene_prediction1


gi|27370500|ref|NP_766581.1| hypothetical
protein D230008H22 [Mus musculus]


HG1000587N0_160000_gene_prediction1


gi|23682449|ref|XP_158842.2| hypothetical
protein XP_158842 [Mus musculus]


HG1000592N0_160000_gene_prediction1


gi|26349599|dbj|BAC38439.1| unnamed protein
product [Mus musculus]


HG1000594N0_160000_gene_prediction1


gi|22095015|ref|NP_084065.1| RIKEN cDNA
0610013117 [Mus musculus]


HG1000594N0_160000_gene_prediction2


gi|22095015|ref|NP_084065.1| RIKEN cDNA
0610013117 [Mus musculus]


HG1000608N0_160000_gene_predictio


gi|20345223|ref|XP_109778.1| similar to
Neurabin-II (Neural tissue-specific F-actin
binding protein II) (Protein phosphatase 1
regulatory subunit 9B) (Spinophilin) (p130)
(PP1bp134) [Mus musculus]


HG1000615N0_160000_gene_prediction1


gi|7710032|ref|NP_057928.1| growth factor
receptor bound protein 14 [Mus musculus]


HG1000620N0_160000_gene_prediction1


gi|25052462|ref|XP_138105.3| similar to TAR
DNA-binding protein-43 (TDP-43) [Mus
musculus]


HG1000621N0_160000_gene_prediction1


gi|3599320|gb|AAC72793.1| ORF2 [Mus
musculus domesticus]


HG1000621N0_160000_gene_prediction3


gi|26382861|dbj|BAC25510.1| unnamed protein
product [Mus musculus]


HG1000631N0_40000_gene_prediction1


gi|6681283|ref|NP_031938.1| epidermal growth

viral (v-erb-b) oncogene homolog [Mus
musculus]


HG1000652N0_160000_gene_prediction1


gi|25030122|ref|XP_207332.1| similar to
endonuclease/reverse transcriptase [Mus
musculus]


HG1000663N0_160000_gene_prediction1


gi|20915416|ref|XP_162987.1| hypothetical
protein XP_162987 [Mus musculus]


HG1000686N0_160000_gene_prediction1


gi|3599320|gb|AAC72793.1| ORF2 [Mus
musculus domesticus]


HG1000700N0_160000_gene_prediction1


gi|16508047|gb|AAL17972.1| pORF2 [Mus
musculus domesticus]


HG1000701N0_160000_gene_prediction1


gi|26327167|dbj|BAC27327.1| unnamed protein
product [Mus musculus]


HG1000709N0_160000_gene_prediction1


gi|220579|dbj|BAA00448.1| open reading
frame (196 AA) [Mus musculus]


HG1000712N0_160000_gene_prediction1


gi|12841826|dbj|BAB25366.1| unnamed protein
product [Mus musculus]


HG1000720N0_160000_gene_prediction1


gi|7657415|ref|NP_035986.2| odd Oz/ten-m
homolog 2 (Drosophila); odd Oz/ten-m
homolog 3 (Drosophila) [Mus musculus]


HG1000727N0_160000_gene_prediction1


gi|26335645|dbj|BAC31523.1| unnamed protein
product [Mus musculus]


HG1000743N0_160000_gene_prediction2


gi|26338834|dbj|BAC33088.1| unnamed protein
product [Mus musculus]


HG1000767N0_5000_gene_prediction1


gi|12851918|dbj|BAB29207.1| unnamed protein
product [Mus musculus]


HG1000786N0_160000_gene_prediction2


rril^^^i7S'^n'^lrpfnsrP O'^'^'^Rfi 1 1 transirrintinn 

gl DO /o.jU^|rci|INr UJJJOU.i] UcUioOiipiiLiAj. 

factor A, mitochondrial [Mus musculus] 


HG1000822N0_160000_gene_prediction1


gi|6680195|ref|NP_032255.1| histone

State University 179, expressed [Mus
musculus]


HG1000829N0_160000_gene_prediction1


gi|21450159|ref|NP_659049.1| cDNA sequence
BC024131; hypothetical protein MGC37896
[Mus musculus]


HG1000848N0_160000_gene_prediction1


gi|26350995|dbj|BAC39134.1| unnamed protein
product [Mus musculus]


HG1000860N0_160000_gene_prediction1


gi|26325678|dbj|BAC26593.1| unnamed protein
product [Mus musculus]


HG1000898N0_10000_gene_prediction1


gi|21450209|ref|NP_659075.1| hypothetical
protein MGC25509 [Mus musculus]


HG1000898N0_160000_gene_prediction1


gi|21450209|ref|NP_659075.1| hypothetical
protein MGC25509 [Mus musculus]


HG1000898N0_20000_gene_prediction1


gi|21450209|ref|NP_659075.1| hypothetical
protein MGC25509 [Mus musculus]


HG1000902N0_160000_gene_prediction1


gi|21450209|ref|NP_659075.1| hypothetical
protein MGC25509 [Mus musculus]


HG1000904N0_160000_gene_prediction3


gi|6753324|ref|NP_033968.1| chaperonin
subunit 6a (zeta); chaperonin containing TCP-1
[Mus musculus]


HG1000906N0_20000_gene_prediction1


gi|20344324|ref|XP_109683.1| RIKEN cDNA
1810027010 [Mus musculus]


HG1000906N0_160000_gene_prediction1


gi|26346114|dbj|BAC36708.1| unnamed protein
product [Mus musculus]


HG1000921N0_5000_gene_prediction1


gi|26346114|dbj|BAC36708.1| unnamed protein
product [Mus musculus]


HG1000938N0_10000_gene_prediction1


gi|26350775|dbj|BAC39024.1| unnamed protein
product [Mus musculus]


HG1000952N0_160000_gene_predictio


gi|26339054|dbj|BAC33198.1| unnamed protein
product [Mus musculus]


HG1000961N0_160000_gene_prediction1


gi|3599320|gb|AAC72793.1| ORF2 [Mus
musculus domesticus]


HG1000961N0_160000_gene_prediction2


gl|/jUJl/0 /|rcl|A.r_ltOOUJ.J| hliiiuai lu 

KIAA0877 protein [Homo sapiens] [Mus
musculus] 


HG1001000N0_160000_gene_prediction2


gi|20859143|ref|XP_127126.1| similar to
eukaryotic initiation factor 5 [Rattus
norvegicus] [Mus musculus]


HG1001003N0_160000_gene_prediction1


gi|19527072|ref|NP_598613.1| expressed
sequence AW555139 [Mus musculus]


HG1001007N0_160000_gene_prediction1


gi|13277825|gb|AAH03796.1| Similar to
lymphocyte specific 1 [Mus musculus]


HG1001009N0_0_gene_prediction1


gi|26334641|dbj|BAC31021.1| unnamed protein
product [Mus musculus]


HG1001014N0_160000_gene_prediction2


gi|26329567|dbj|BAC28522.1| unnamed protein
product [Mus musculus]


HG1001017N0_40000_gene_prediction1


gi|26337385|dbj|BAC32378.1| unnamed protein
product [Mus musculus]


HG1001017N0_20000_gene_prediction1


gi|25019831|ref|XP_207463.1| similar to
CD59B [Mus musculus]


HG1001144N0_160000_gene_prediction1


gi|25019831|ref|XP_207463.1| similar to
CD59B [Mus musculus]


HG1001172N0_160000_gene_prediction2


gi|3599320|gb|AAC72793.1| ORF2 [Mus
musculus domesticus]


HG1001214N0_20000_gene_prediction1


gi|26340706|dbj|BAC34015.1| unnamed protein
product [Mus musculus]


HG1001229N0_160000_gene_prediction1




HG1001253N0_160000_gene_prediction1


gi|3599320|gb|AAC72793.1| ORF2 [Mus
musculus domesticus]


HG1001253N0_160000_gene_prediction2


gi|26326251|dbj|BAC26869.1| unnamed protein
product [Mus musculus]


HG1001267N0_160000_gene_prediction1


gi|26326251|dbj|BAC26869.1| unnamed protein
product [Mus musculus]


HG1001289N0_160000_gene_prediction1


gi|3599320|gb|AAC72793.1| ORF2 [Mus
musculus domesticus]


HG1001343N0_10000_gene_prediction


gi|26333317|dbj|BAC30376.1| unnamed protein
product [Mus musculus]


HG1001343N0_160000_gene_prediction1


gi|6755060|ref|NP_035214.1|
phosphatidylinositol 3-kinase, C2 domain
containing, gamma polypeptide [Mus
musculus]


HG1001390N0_160000_gene_prediction1


gi|6755060|ref|NP_035214.1|
phosphatidylinositol 3-kinase, C2 domain
containing, gamma polypeptide [Mus
musculus]


HG1001468N0_160000_gene_prediction1


gi|6680083|ref|NP_032189.1| growth factor
receptor bound protein 2 [Mus musculus]


HG1001508N0_160000_gene_prediction2


gi|25030495|ref|XP_205178.1| similar to
bA130N24.1 (novel protein similar to REV3L
(REV3 (yeast homolog)-like, catalytic subunit
of DNA polymerase zeta) (POLZ)) [Homo
sapiens] [Mus musculus]


HG1000084N0_160000_gene_prediction1


gi|26382861|dbj|BAC25510.1| unnamed protein
product [Mus musculus]


HG1000084N0_160000_gene_prediction2


gi|25031822|ref|XP_207741.1| hypothetical
protein XP_207741 [Mus musculus]


HG1000209N0_160000_gene_prediction1


gi|25031822|ref|XP_207741.1| hypothetical
protein XP_207741 [Mus musculus]


HG1000382N0_160000_gene_prediction1


gi|20858167|ref|XP_125585.1| similar to
PTD013 protein; CGI-24 protein [Mus
musculus]


HG1000591N0_160000_gene_prediction1


gi|6678716|ref|NP_032539.1| low density
lipoprotein receptor-related protein 5; low
density lipoprotein-related protein 5 [Mus
musculus]


HG1000904N0_160000_gene_prediction4


gi|26330005|dbj|BAC28741.1| unnamed protein
product [Mus musculus]


HG1000005N0_160000_gene_predictio


gi|20835832|ref|XP_129684.1| complement
receptor 2 [Mus musculus]


HG1000014N0_160000_gene_predictio


gi|3599320|gb|AAC72793.1| ORF2 [Mus
musculus domesticus]


nl 


gi|6680744|ref|NP_031528.1| ATPase, Na+/K+
transporting, beta 3 polypeptide; ATPase,
Na+/K+ beta 3 polypeptide [Mus musculus]


HG1000015N0_20000_gene_prediction


gi|20467423|ref|NP_620570.1| chondroitin
sulfate proteoglycan 4 [Mus musculus]


HG1000015N0_5000_gene_prediction1


gi|20467423|ref|NP_620570.1| chondroitin
sulfate proteoglycan 4 [Mus musculus]


HG1000015N0_160000_gene_prediction2


gi|20467423|ref|NP_620570.1| chondroitin
sulfate proteoglycan 4 [Mus musculus]


HG1000020N0_160000_gene_predictio


gi|20467423|ref|NP_620570.1| chondroitin
sulfate proteoglycan 4 [Mus musculus]


HG1000020N0_5000_gene_prediction2


gi|26330706|dbj|BAC29083.1| unnamed protein
product [Mus musculus]


HG1000024N0_10000_gene_prediction1


gi|20887101|ref|XP_129228.1| similar to
phosphoglucomutase 5 [Homo sapiens] [Mus
musculus]


HG1000026N0_160000_gene_prediction1


gi|12853786|dbj|BAB29848.1| unnamed protein
product [Mus musculus]


HG1000030N0_160000_gene_prediction1


gi|9506367|ref|NP_062425.1| ATP-binding
cassette, sub-family B, member 10; ATP-binding
cassette, sub-family B (MDR/TAP),
member 12; Abc-mitochondrial erythroid [Mus
musculus]


HG1000039N0_160000_gene_prediction1


gi|26006203|dbj|BAC41444.1| mKIAA0696
protein [Mus musculus]


HG1000041N0_5000_gene_prediction1


gi|7106453|ref|NP_035897.1| zinc finger RNA
binding protein [Mus musculus]


HG1000043N0_160000_gene_prediction1


gi|26390169|dbj|BAC25854.1| unnamed protein
product [Mus musculus]


HG1000043N0_5000_gene_prediction1


gi|26337385|dbj|BAC32378.1| unnamed protein
product [Mus musculus]


HG1000044N0_20000_gene_prediction


gi|26337385|dbj|BAC32378.1| unnamed protein
product [Mus musculus]


HG1000052N0_160000_gene_prediction2


gi|15079309|gb|AAH11494.1| Similar to
Myosin of the dilute-myosin-V family [Mus
musculus]


HG1000052N0_10000_gene_prediction1


gi|26324852|dbj|BAC26180.1| unnamed protein
product [Mus musculus]


HG1000052N0_20000_gene_prediction1


gi|26324852|dbj|BAC26180.1| unnamed protein
product [Mus musculus]


HG1000058N0_10000_gene_prediction


gi|26324852|dbj|BAC26180.1| unnamed protein
product [Mus musculus]


HG1000061N0_5000_gene_prediction1


gi|3599320|gb|AAC72793.1| ORF2 [Mus
musculus domesticus]


HG1000065N0_5000_gene_prediction1


gi|5031571|ref|NP_005713.1| actin-related
protein 2; ARP2 (actin-related protein 2, yeast)
homolog [Homo sapiens]


W01 nnnOfi^'Krn 1 noon o-pne -nrediction 
1 


gi|13386220|ref|NP_081610.1| RIKEN cDNA
2210414H16 [Mus musculus]


JlvJl UUUWO jINU lU\/Uv/\/_^CllC_piCUJ.Vti*_» 

nl 


gi|13386220|ref|NP_081610.1| RIKEN cDNA
2210414H16 [Mus musculus]


JlVjr 1 UUUUu olN U 1 OV/V/U V gCllC_JJl CUi V iiKJ 

nl 


gi|13386220|ref|NP_081610.1| RIKEN cDNA
2210414H16 [Mus musculus]


HG1000070N0_0_gene_prediction1


gi|26326191|dbj|BAC26839.1| unnamed protein
product [Mus musculus]


1


gi|21595527|gb|AAH32275.1| Similar to
receptor-like tyrosine kinase [Mus musculus]


HG1000075N0_160000_gene_predictio


gi|26326407|dbj|BAC26947.1| unnamed protein
product [Mus musculus]


HG1000076N0_160000_gene_prediction1


gi|3599320|gb|AAC72793.1| ORF2 [Mus
musculus domesticus]


HG1000081N0_160000_gene_prediction1


gi|4502549|ref|NP_001734.1| calmodulin 2
(phosphorylase kinase, delta); phosphorylase
kinase delta [Homo sapiens]


HG1000106N0_160000_gene_prediction1


gi|6680305|ref|NP_032328.1| heat shock
protein, 84 kDa 1 [Mus musculus]


HG1000107N0_160000_gene_predictio


gi|6681225|ref|NP_031905.1| developmentally
regulated GTP binding protein 1;
developmentally regulated GTP-binding
protein 1 [Mus musculus]


HG1000109N0_0_gene_prediction1


gi|6754774|ref|NP_034986.1| myosin heavy
chain, cardiac muscle, adult; alpha cardiac
MHC; alpha myosin [Mus musculus]


f>nf>1 1 OXrn l^nnnn opr>p -nre-AirUn 
xlvj i UuU 1 1 ZIN \J 1 OvuVJV/ ^C11C__|J1CU1L'LHJ 

nl 


gi|23956080|ref|NP_058675.1| putative
serine/threonine kinase [Mus musculus]


HG1000116N0_160000_gene_predictio


gi|3599320|gb|AAC72793.1| ORF2 [Mus
musculus domesticus]


HG1000126N0_160000_gene_prediction1


gi|6680305|ref|NP_032328.1| heat shock
protein, 84 kDa 1 [Mus musculus]


HG1000130N0_160000_gene_prediction1


gi|20825377|ref|XP_143696.1| similar to
hypothetical protein dJ12208.2 [Homo
sapiens] [Mus musculus]


HG1000132N0_160000_gene_prediction1


gi|6754208|ref|NP_034569.1| high mobility
group box 1; high mobility group protein 1
[Mus musculus]


HG1000133N0_160000_gene_prediction1


gi|26347765|dbj|BAC37531.1| unnamed protein
product [Mus musculus]


HG1000134N0_20000_gene_prediction1


gi|26382599|dbj|BAB22733.2| unnamed protein
product [Mus musculus]


HG1000134N0_20000_gene_prediction2


gi|26353738|dbj|BAC40499.1| unnamed protein
product [Mus musculus]


HG1000142N0_160000_gene_prediction1


gi|26353738|dbj|BAC40499.1| unnamed protein
product [Mus musculus]


HG1000144N0_?0000_gene_prediction1


gi|6679108|ref|NP_032748.1| nucleophosmin 1;
nucleolar protein NO38 [Mus musculus]


HG1000145N0_160000_gene_prediction1


gi|6677779|ref|NP_033107.1| ribosomal protein
L28; DNA segment, Chr 7, Wayne State
University 21, expressed [Mus musculus]


HG1000146N0_160000_gene_prediction1


gi|6677779|ref|NP_033107.1| ribosomal protein
L28; DNA segment, Chr 7, Wayne State
University 21, expressed [Mus musculus]


HG1000150N0_10000_gene_prediction1


gi|3717978|emb|CAA73041.1| 5S ribosomal
protein [Mus musculus]


riGiuuuijzJNO louuuu gene_preciictio 
nl 


gi|11037798|ref|NP_067621.1| dynactin 5; 
dynactin 4; p25 dynactin subunit [Mus 
musculus] 


HG1000161N0_160000_gene_prediction1


gi|zi jj0Z't'Z|rei|iNr j/j4yy.i| giucoconicoio 
induced transcript 1; testhymin; 
thymocyte/spermatocyte selection 1 [Mus 
musculus] 


HG1000163N0_160000_gene_prediction1


gi|20819730|ref|XP_129359.1| hypothetical
protein XP_129359 [Mus musculus]


HG1000164N0_5000_gene_prediction1


gi|20835770|ref|XP_132127.1| similar to 60S
RIBOSOMAL PROTEIN L13 [Mus musculus]


HG1000165N0_1000_gene_prediction1


gi|26340448|dbj|BAC33887.1| unnamed protein
product [Mus musculus]


HG1000166N0_160000_gene_prediction2


gi|26353666|dbj|BAC40463.1| unnamed protein
product [Mus musculus]


HG1000167N0_160000_gene_prediction1


gi|27369878|ref|NP_766203.1| hypothetical
protein 5330403K09 [Mus musculus]


HG1000171N0_40000_gene_prediction


gi|26354683|dbj|BAC40968.1| unnamed protein
product [Mus musculus]


HG1000171N0_160000_gene_prediction1


gi|26325838|dbj|BAC26673.1| unnamed protein
product [Mus musculus]


HG1000175N0_160000_gene_prediction2


gi|26325838|dbj|BAC26673.1| unnamed protein
product [Mus musculus]


HG1000176N0_1000_gene_prediction1


gi|26354216|dbj|BAC40736.1| unnamed protein
product [Mus musculus]


HG1000176N0_160000_gene_prediction1


gi|26337635|dbj|BAC32503.1| unnamed protein
product [Mus musculus]


HG1000177N0_160000_gene_prediction1


gi|26337635|dbj|BAC32503.1| unnamed protein
product [Mus musculus]


HG1000178N0_160000_gene_prediction1


gi|20884040|ref|XP_134731.1| endothelial
differentiation, sphingolipid G-protein-coupled
receptor, 5 [Mus musculus]


HG1000178N0 160000_gene_predictio 
ni 


gi|13384830|ref|NP_079706.1| RIKEN cDNA
1110066C01 [Mus musculus]


HG1000180N0_1000_gene_prediction1


gi|13384830|ref|NP_079706.1| RIKEN cDNA
1110066C01 [Mus musculus]


HG1000181N0_10000_gene_prediction1


gi|13384730|ref|NP_079640.1| RIKEN cDNA
1110005A23 [Mus musculus]


HG1000181N0_160000_gene_prediction1


gi|25023031|ref|XP_205093.1| similar to
hypothetical protein FLJ38281 [Homo sapiens]
[Mus musculus]


HG1000183N0_160000_gene_prediction1


gi|26334755|dbj|BAC31078.1| unnamed protein
product [Mus musculus]


HG1000186N0_20000_gene_prediction1


gi|27370150|ref|NP_766364.1| hypothetical
protein D630002G06 [Mus musculus]


HG1000186N0_160000_gene_prediction2




HG1000187N0_20000_gene_prediction1


gi|26342222|dbj|BAC34773.1| unnamed protein
product [Mus musculus]


HG1000187N0_160000_gene_prediction3




HG1000189N0_1000_gene_prediction1


gi|25024769|ref|XP_207136.1| similar to ORF2
[Mus musculus domesticus]


HG1000189N0_5000_gene_prediction1


gi|26325734|dbj|BAC26621.1| unnamed protein
product [Mus musculus]


HG1000189N0_1000_gene_prediction2


gi|20879992|ref|XP_140210.1| similar to
BG:DS01759.1 gene product [Drosophila
melanogaster] [Mus musculus]


HG1000189N0_5000_gene_prediction2


gi|26325734|dbj|BAC26621.1| unnamed protein
product [Mus musculus]


HG1000195N0_10000_gene_prediction


gi|20879992|ref|XP_140210.1| similar to
BG:DS01759.1 gene product [Drosophila
melanogaster] [Mus musculus]


HG1000199N0_160000_gene_prediction1


gi|17390530|gb|AAH18231.1| Unknown
(protein for MGC:19236) [Mus musculus]


HG1000201N0_10000_gene_prediction1


gi|20824845|ref|XP_131963.1| expressed
sequence C77020 [Mus musculus]


HG1000203N0_5000_gene_prediction1


gi|27477269|ref|XP_209223.1| similar to
Transforming protein RhoC (E9) [Homo
sapiens]


HG1000204N0_10000_gene_prediction1


gi|26333233|dbj|BAC30334.1| unnamed protein
product [Mus musculus]


HG1000209N0_160000_gene_prediction2


gi|26326739|dbj|BAC27113.1| unnamed protein
product [Mus musculus]


JtiLriuuu/i jinu_jUUU _gene_preaicuoiii 


gi|27369784|reflNP_766142.1| hypothetical 
protein A230053P19 [Mus musculus] 


HG1000215N0_1000_gene_prediction1


gi|6671756|ref|NP_031732.1| suppressor of
cytokine signaling 2; cytokine inducible
SH2-containing protein 2; high growth;
STAT-induced STAT inhibitor 2; cytokine-inducible
SH2 protein 2 [Mus musculus]


HG1000219N0_10000_gene_prediction


gi|26328915|dbj|BAC28196.1| unnamed protein
product [Mus musculus]


HG1000221N0_160000_gene_prediction1


gi|4504255|ref|NP_002097.1| H2A histone
family, member Z; H2AZ histone [Homo
sapiens]


HG1000221N0_20000_gene_prediction1


gi|11360345|pir||T42725 actin binding protein
ACF7, neural isoform 1 - mouse (fragment)


HG1000223N0_160000_gene_prediction1


gi|11360345|pir||T42725 actin binding protein
ACF7, neural isoform 1 - mouse (fragment)


JtivjriuuuzzjJNU iouuuu_gene_preciictio 
nl 


gi|25019988|ref|XP_207469.1| similar to 
Retrovirus-related POL polyprotein [Mus 
musculus] 


JnLLriuuu/iDJNU louuuu gene_preaictio 
nl 


gi|20137004|ref|NP_035320.1| proteasome
(prosome, macropain) 28 subunit, beta;
protease (prosome, macropain) 28 subunit, beta
[Mus musculus]


HG1000236N0_160000_gene_prediction1


gi|l3ol /iy/|rei|J>IP_U//lJ3,l| AiPase, tH- 
transporting, lysosomal 13kD, V1 subunit G
isoform 1; ATPase, H+ transporting, lysosomal
(vacuolar proton pump) [Mus musculus]


rUjriuuuzooJNU 1 ouuuu_gene_preclictio 
nl 


gi|6671704|ref|NP_031664.1| chaperonin
subunit 7 (eta) [Mus musculus]


HG1000238N0_5000_gene_prediction1


gi|6671549|ref|NP_031479.1| anti-oxidant
protein 2; acidic calcium-independent
phospholipase A2; peroxiredoxin 5; 1-Cys Prx
[Mus musculus]


HG1000239N0_160000_gene_prediction1


gi|6671549|ref|NP_031479.1| anti-oxidant
protein 2; acidic calcium-independent
phospholipase A2; peroxiredoxin 5; 1-Cys Prx
[Mus musculus]


HG1000241N0_160000_gene_prediction1


gi|7657357|ref|NP_056596.1| nucleosome
assembly protein 1-like 1; nucleosome
assembly protein-1 [Mus musculus]


nl


gi|4759158|ref|NP_004588.1| small nuclear
ribonucleoprotein D2 polypeptide 16.5kDa;
small nuclear ribonucleoprotein D2 polypeptide
(16.5kD) [Homo sapiens]


Wr^l nnn94'^>jn l^iOnnfl o-pup nrpHirflo 

n2 


gi|8393534|ref|NP_058653.1| high mobility
group protein 17 [Mus musculus]


HG1000245N0_1000_gene_prediction1


gi|8393534|ref|NP_058653.1| high mobility
group protein 17 [Mus musculus]


HG1000250N0_160000_gene_prediction1


gi|12850132|dbj|BAB28604.1| unnamed protein
product [Mus musculus]


HG1000252N0_160000_gene_prediction1


gi|20824845|ref|XP_131963.1| expressed
sequence C77020 [Mus musculus]


JtlVjrlUUUZjjlNU l\J\JKJV KCiic prcuidiuii 
1 


gi|17105394|ref|NP_000975.2| ribosomal
protein L23a; 60S ribosomal protein L23a;
melanoma differentiation-associated gene 20
[Homo sapiens]


JnlvjrlUUUZDZiNU luuuuu ^cixc pxcun^uu 

n2 


gi|13385532|ref|NP_080303.1| RIKEN cDNA
2700086123 [Mus musculus]


nl


gi|3599320|gb|AAC72793.1| ORF2 [Mus
musculus domesticus]


HG1000264N0_5000_gene_prediction1


gi|26360198|dbj|BAB25612.2| unnamed protein
product [Mus musculus]


HG1000264N0_5000_gene_prediction2


gi|21624617|ref|NP_081018.1| RIKEN cDNA
1110007M04 [Mus musculus]


JClijlUUUZOJlNU IOUUUU ^CIIC piPUl^llU 

nl 


gi|21624617|ref|NP_081018.1| RIKEN cDNA
1110007M04 [Mus musculus]


HG1000266N0_0_gene_prediction1


gi|25070241|ref|XP_192786.1| proline rich
protein expressed in brain [Mus musculus]


HG1000266N0_160000_gene_prediction1


gi|12584972|ref|NP_075021.1| lipin 3 [Mus
musculus]


HG1000267N0_5000_gene_prediction1


gi|26340094|dbj|BAC33710.1| unnamed protein
product [Mus musculus]


HG1000270N0_160000_gene_prediction1


gi|6679937|ref|NP_032110.1|
glyceraldehyde-3-phosphate dehydrogenase [Mus musculus]


HG1000271N0_10000_gene_prediction1


gi|12844196|dbj|BAB26273.1| unnamed protein
product [Mus musculus]


HG1000271N0_160000_gene_prediction1


gi|26345908|dbj|BAC36605.1| unnamed protein
product [Mus musculus]


HG1000273N0_160000_gene_prediction1


gi|26345908|dbj|BAC36605.1| unnamed protein
product [Mus musculus]


HG1000295N0_160000_gene_prediction1


gi|20888943|ref|XP_129258.1| cDNA sequence
AF233884 [Mus musculus]


HG1000296N0_160000_gene_prediction1


gi|21313266|ref|NP_080089.1| RIKEN cDNA
1200003006 [Mus musculus]


JtlLriuuu/yyiNU iouuuu gene_prcuit.Lio
nl


gi|25054735|ref|XP_192839.1| ATPase, class II,
type 9B [Mus musculus]


HG1000300N0_10000_gene_prediction1


gi|6753882|ref|NP_034349.1| FK506 binding
protein 4 (59 kDa) [Mus musculus]


HG1000306N0_0_gene_prediction1


gi|25024769|ref|XP_207136.1| similar to ORF2
[Mus musculus domesticus]


rlLrl UUUJUoiN u_u_gene_preaicuonz 




HG1000312N0_160000_gene_prediction1




HG1000314N0_1000_gene_prediction1


gi|4506283|ref|NP_003454.1| protein tyrosine
phosphatase type IVA, member 1; Protein
tyrosine phosphatase IVA1 [Homo sapiens]


HG1000315N0_160000_gene_prediction1


gi|4506285|ref|NP_003470.1| protein tyrosine
phosphatase type IVA, member 2, isoform 1;
protein tyrosine phosphatase IVA; protein
tyrosine phosphatase IVA2; phosphatase of
regenerating liver 2 [Homo sapiens]


JaLrlUUUJjUJNU louuuu gene_preaiciio 
n2 


gi|6679553|ref|NP_033003.1| protein tyrosine
phosphatase non-receptor type 2 [Mus
musculus]


JtlijriuuujJUJNU louuuu genej)reaicTio 
n4 


gi|12860388|dbj|BAB31939.1| unnamed protein
product [Mus musculus] 


HG1000332N0_1 0000_gene_prediction 


gi|26344091|dbj|BAC35702.1| unnamed protein
product [Mus musculus] 


HG1000337N0_1000_gene_prediction1


gi|20987322|gb|AAH30185.1| Unknown 
(protein for MGC:29401) [Mus musculus] 


HG1000341N0_5000_gene_prediction1


gi|4506725|ref|NP_000998.1| ribosomal protein 
S4, X-linked X isoform; 40S ribosomal protein 
S4, X isoform; ribosomal protein S4X isoform; 
single-copy abundant mRNA; cell cycle gene 2 
[Homo sapiens] 


HG1000341N0_10000_gene_prediction 
1 


gi|26332837|dbj|BAC30136.1| unnamed protein 
product [Mus musculus]


HGl 000353N0_1 60000_gene_predictio 
nl 


gi|17157989|ref|NP_473384.1| Musashi
homolog 2 (Drosophila) [Mus musculus]


HG1000357N0 20000_gene_prediction 
1 


gi|25021483|ref|XP_207941.1| similar to
Retrovirus-related POL polyprotein [Mus
musculus]


HG1000358N0_5000 _gene_predictionl 


gi|27372319|dbj|BAC53724.1| Piccolo [Mus
musculus] 


HG1000359N0_160000_gene_predictio 
nl 


gi|3599320|gb|AAC72793.1| ORF2 [Mus
musculus domesticus] 


HGl 000363N0_160000_gene_predictio 
nl 


gi|3599320|gb|AAC72793.1| ORF2 [Mus
musculus domesticus] 





nl 


gi|19484126|gb|AAH25846.1| Unknown
(protein for MGC:32383) [Mus musculus]


■PT^l nnn'^/iTWO l/^nOOO af»np nrerlirtio 

nVJl vvUJO /IN V lOUvUU Ji^CliC yJl cm Q li\J 

nl 


gi|13928676|ref|NP_113687.1| proline rich
protein 2 [Mus musculus]


nl 


gi|20863632|ref|XP_164160.1| hypothetical
protein XP_164160 [Mus musculus]


JtHjrlUUUj5'UlNU_lUUUU jjCDS pfCUH^lluii 


gi|3599320|gb|AAC72793.1| ORF2 [Mus
musculus domesticus] 


HG1000390N0_5000 _gene_predictionl 


gi|20892585|ref|XP_147977.1| RIKEN cDNA
2610001E17 [Mus musculus] 


jtiLriuuujyiiNU iDuuww_geiio_prcuit/Liu 
nl 


gi|20892585|ref|XP_147977.1| RIKEN cDNA
2610001E17 [Mus musculus] 


jtiijriuuujyoiNU louuuu gene prpuicuu 
n2 


gi|26330368|dbj|BAC28914.1| unnamed protein 
product [Mus musculus] 


HG1000401N0_10000_gene_prediction




HG1000407N0_160000 _gene_predictio 


gi|12853695|dbj|BAB29819.1| unnamed protein
product [Mus musculus]


rHjrluUU4UoiNU louuuu gens prcun^uu 
n2 


gi|25029560|ref|XP_203691.1| similar to
PROBABLE POL POLYPROTEIN [Mus 
musculus] 


rlOlUUlFU'tiNU louuuu gencjircuiouu 
n2 


gi|26326871|dbj|BAC27179.1| unnamed protein
product [Mus musculus] 


xHjrluuu4iorNU louuuu _gcnc_prcuicLio 
nl 


gi|20902061|ref|XP_147959.1| hypothetical
protein XP_147959 [Mus musculus]


rHjfiUUU^ZoiNU louuuu _gcnc__prcuicuu 
nl 


gi|25032567|ref|XP_207391.1| similar to ORF2
[Mus musculus domesticus] 


HG1000429N0 160000 gene_predictio 
nl 


gi|25022040|ref|XP_204233.1| similar to ORF2
[Mus musculus domesticus] 


HGl 00043 1 N0_20000_gene__prediction 
1 


gi|26339864|dbj|BAC33595.1| unnamed protein 
product [Mus musculus] 


nl 


gi|8394057|reflNP_058565.1| low density 
lipoprotein receptor-related protein 4; low 
density lipoprotein-related protein 4; Low 
Density Lipoprotein Receptor Related Protein 
4; corin [Mus musculus] 


xjr^i nnn/i4ixm lAnnnn at^p m-pAiriin 
nl 


gi|26340972|dbj|BAC34148.1| unnamed protein
product [Mus musculus] 


HG1000441N0 160000 gene_predictio 
n2 


gi|12836479|dbj|BAB23675.1| unnamed protein
product [Mus musculus] 


HGl 000446N0_1 60000_gene_predictio 
nl 


gi|25029827|ref|XP_207226.1| similar to ORF2
[Mus musculus domesticus] 


HG1000446N0 160000 _gene_predictio 
n2 


gi|25031497|reflXP_207552.1| similar to 
Retrovirus-related POL polyprotein [Mus 
musculus]


HG1000449N0 160000 gene_predictio 
n2


gi|3599320|gb|AAC72793.1| ORF2 [Mus
musculus domesticus] 


HGl 00045 INO 160000_gene _predictio 
nl 


gi|25054021|ref|XP_192811.1| similar to
Transmembrane protease, serine 2
(Epitheliasin) (Plasmic transmembrane protein
X) [Mus musculus]


HG100045^0 10000_gene_prediction 
1 


gi|20846744|ref|XP_144090.1| similar to
hypothetical protein FLJ12457 [Mus musculus] 


HGl 00046 INO 100bO_^ene__prediction 
1


gi|20824899|ref|XP_144255.1| hypothetical
protein XP_144255 [Mus musculus] 


HG1000474N0_5000_gene_prediction1


gi|12853695|dbj|BAB29819.1| unnamed protein
product [Mus musculus] 


HG1000476N0_1000_gene_predictionl 


gi|12834707|dbj|BAB23011.1| unnamed protein
product [Mus musculus] 


nl 


no blast hit 


HG1000499N0 160000^ene_predictio 
nl 


gi|3599320|gb|AAC72793.1| ORF2 [Mus 
musculus domesticus] 


HG1000500N0 160000 gene_predictio 
n1


gi|20912903|ref|XP 126663.1| RIKEN cDNA 
2410154J16 [Mus musculus] 


HG1000505N0 160000 gene_predictio 
n1


gi|25044951|ref|XP_195302.1| similar to
olfactory receptor MOR256-23 [Mus musculus] 


HGl 0005 09N0 10000_gene _prediction 
1 


gi|26334721|dbj|BAC31061.1| unnamed protein
product [Mus musculus] 


HGl 0005 lONO 160000_gene_predictio 
nl 


gi|12834707|dbj|BAB23011.1| unnamed protein
product [Mus musculus] 


HGl 0005 13N0 160000 _gene_predictio 
nl 


gi|12859663|dbj|BAB31727.1| unnamed protein
product [Mus musculus] 


HGl 0005 1 9N0_1 60000_gene_prBdictio 
nl 


gi|119146|sp|P20001|EF11_CRIGR Elongation
factor 1-alpha 1 (EF-1-alpha-1) (Elongation
factor 1 A-1) (eEF1A-1) (Elongation factor Tu)
(EF-Tu)


HG 1 00052 1N0_1 60000_gene_predictio 
nl 


gi|2495301|sp|Q63934|BR3B_MOUSE Brain- 
specific homeobox/POU domain protein 3B 
(BRN-3B) (BRN-3.2) 


HG 1 000524N0_1 60000_gene_predictio 
nl 


gi|21280325|dbj|BAB96760.1| type XXVI
collagen [Mus musculus] 


HG1000530N0 20000 gene_prediction 
1 


gi|6679921|ref|NP_032102.1| gamma-
aminobutyric acid (GABA-A) receptor, subunit
rho 2 [Mus musculus] 


HG1000530N0 160000_gene_predictio 
n2 


gi|23622684|ref|XP_156394.2| expressed 
sequence AL023001 [Mus musculus] 





HGr000534N0 160000 gene predictio 
nl 




HG 1 000545N0 1 60000 gene predictio 
nl 


gi|3599320|gb|AAC72793.1| ORF2 [Mus
musculus domesticus] 


nl 


gi|26341288|dbj|BAC34306.1| unnamed protein
product [Mus musculus] 


HG1000549N0 160000 gene_predictio 
n2 


gi|3599320|gb|AAC72793.1| ORF2 [Mus
musculus domesticus] 


HG1000549N0_160000_gene_predictio 


gi|21312126|ref|NP_081135.1| RIKEN cDNA
1110068E11 [Mus musculus]


HGl 0005 5 3N0_1 60000_gene_predictio 


gi|3599320|gb|AAC72793.1| ORF2 [Mus
musculus domesticus] 


HG1000560N0 160000 gene predictio 
n2 


gi|25032555|reflXP_207412.1| similar to 
Retrovirus-related POL polyprotein [Mus 
musculus] 


nl 




HG 1 0005 66N0 40000 gene prediction 
1 


gi|20856064|ref|XP_151615.1| hypothetical
protein XP_151615 [Mus musculus]


HG 10005 66N0 1 60000_gene_predictio 
nl 




HG1000582N0 160000 gene__predictio 
nl 


gi|7656873|ref|NP_056579.1| RIKEN cDNA
5730583K22 gene [Mus musculus] 


HG1000598N0 160000 gene predictio 
nl 


gi|4512261|dbj|BAA75227.1| neurochondrin-2
[Mus musculus] 


HG1000606N0 20000_gene_prediction 


gi|19527094|ref|NP_598640.1| expressed
sequence AI327031 [Mus musculus] 


HG 1 000607N0 1 60000_gene_predictio 
nl 


gi|25058382|ref|XP_206318.1| hypothetical
protein XP_206318 [Mus musculus]


1 


gi|3599320|gb|AAC72793.1| ORF2 [Mus
musculus domesticus] 


HGl 0006 1 6N0_1 000_gene_predictionl 


gi|26387941|dbj|BAC25633.1| unnamed protein
product [Mus musculus] 


HG 1 000622N0_1 60000_gene_predictio 




HGl 000623N0_1 60000_gene_predictio 


gi|20904129|ref|XP_155605.1| hypothetical
protein XP_155605 [Mus musculus]


HG1000624N0 160000_gene_predictio 
nl 


gi|13542693|gb|AAH05553.1| putative chloride
channel (similar to Mm Clcn4-2) [Mus 
musculus] 


HG1000625N0_160000 _gene_predictio 
nl 


gi|20901495|ref|XP_140099.1| RIKEN cDNA
9130404H23 [Mus musculus] 


HG1000628N0 40000 _gene_prediction 
1 


gi|3599320|gb|AAC72793.1| ORF2 [Mus
musculus domesticus] 
HG1000628N0 20000 gene_prediction
1 


gi|26339720|dbj|BAC33523.1| unnamed protein 
product [Mus musculus] 


HG1000638N0_5000 _gene_predictionl 


gi|3599320|gb|AAC72793.1| ORF2 [Mus
musculus domesticus] 


HGl 000642N0_1 60000_gene_pre(iictio 
nl 




HGl 000646N0_1 60000 _gene_predictio 
nl 


gi|3599320|gb|AAC72793.1| ORF2 [Mus
musculus domesticus]


nl 


gi|25049717|ref|XP_149640.2| similar to gene
Dbp73D protein - fruit fly (Drosophila 
melanogaster) [Mus musculus] 


rlijrl uuuo juiNU iouuv/v/ ^ciic picuic/iiu 

nl 


gi|3599320|gb|AAC72793.1| ORF2 [Mus 
musculus domesticus] 


nvjl UUUO jZINU louuuu >;ciic picun^tiu 

n2 


gi|26377673|dbj|BAC25377.1| unnamed protein 
product [Mus musculus] 


ur^i nnn/^^Axrn i^nnnn o-Anp nrprliVHn 

rlOlUUUO^DlNU lOUUUv/ Js^CXiC piPUil^uu 

nl 


gi|13384666|ref|NP_079583.1| nuclear receptor
binding factor 2 [Mus musculus] 


jlOr 1 uuuo J OiN yj 1 owuuu_gciic_|jicujt'Uu 
n2 


gi|25050704|ref|XP_133465.2| RIKEN cDNA
2410004H02 [Mus musculus] 


xHjlUUUOjyiNU zuuuu gene preuioiiuii 
1 


gi|25050704|ref|XP_133465.2| RIKEN cDNA
2410004H02 [Mus musculus] 


HG1000661N0 20000 gene_prediction 
1 


gi|26333733|dbj|BAC30584.1| unnamed protein
product [Mus musculus] 


HG1000664N0 160000 _gene_predictio 
nl 


gi|27372319|dbj|BAC53724.1| Piccolo [Mus 
musculus] 


"HTil nnnfi7n>Jn l^OOOn aenp nreHictio 

XlkJll/l/UU /V/iNV/ lUUV/UU >^C11C iJlPU-iVrfim 

nl 


gi|6680195|ref|NP_032255.1| histone 
deacetylase 2; DNA segment, Chr 10, Wayne 
State University 179, expressed [Mus 
musculus] 


n2 


gi|17313266|ref|NP_478121.1| RecQ protein-
like 4 [Mus musculus]


LxXJ 1 UUUOy WIN U ZUUWU j^cllC piCUlt/iiuii 

1 




XJi^l nnA/^Onxrn OC\C\C\C\ o*»np nrpHirtinn 
IIXj i UvUOy V/iNU , ZUWW\/_^cilC pi cuic- liuii 

2 


gi|26340662|dbj|BAC33993.1| unnamed protein
product [Mus musculus] 


HG1000696N0 20000 gene prediction 
1 


gi|26340662|dbj|BAC33993.1| unnamed protein
product [Mus musculus] 


HG1000696N0 40000_gene_prediction 
1 


gi|26326171|dbj|BAC26829.1| unnamed protein
product [Mus musculus] 


HG 1 000697N0_1 60000_gene_predictio 
nl 


gi|25024387|reflXP_207341.1| hypothetical 
protein XP_207341 [Mus musculus] 


HG1000700N0_160000_gene_predictio 
n2 


gi|26351279|dbj|BAC39276.1| unnamed protein
product [Mus musculus] 


HG1000704N0 160000 gene predictio 


nl


gi|21644579|ref|NP_660253.1| Williams-
Beuren syndrome critical region gene 17 [Mus
musculus]




gi|23273683|gb|AAH37239.1| Similar to
BCL2-associated athanogene 4 [Mus musculus]


nl 


gi|12856848|dbj|BAB30802.1| unnamed protein
product [Mus musculus] 


JrlVJlvUU /^"INW lUV/V/UU __gCllc_jjicuJ.»-'U»-i 

nl 


gi|26339470|dbj|BAC33406.1| unnamed protein
product [Mus musculus] 


HGl 00073 9N0 160000 gene_predictio 
n2 


gi|3599320|gb|AAC72793.1| ORF2 [Mus
musculus domesticus] 




gi|3599320|gb|AAC72793.1| ORF2 [Mus
musculus domesticus] 


TTOl nnn7 J.'^'Nin l^^nnnn ae^ne nredictio 
nl 


gi|23601536|ref|XP_130965.2| Nice-4 protein
homolog [Mus musculus]


Tj"/"* 1 nnn'7'70'\Tn lAnfinn o-i»nA -nrpdirtio 

XlVJlUUU/ /yiNU_lDUwU UCIIC piCUiL^UU 

nl 


gi|2627027|dbj|BAA23475.1| Ftp-1 [Mus
musculus] 


HGl 00078 1N0_1 60000 _gene_predictio 
nl 


gi|25023334|ref|XP_204722.1| similar to
formin [Mus musculus] 


HG1000781N0_160000 _gene_predictio 
n2 


gi|26350877|dbj|BAC39075.1| unnamed protein 
product [Mus musculus] 


HG1000786N0_160000 _gene_predictio 


gi|25023581|ref|XP_207103.1| similar to
Retrovirus-related POL polyprotein [Mus 
musculus] 




gi|26340832|dbj|BAC34078.1| unnamed protein 
product [Mus musculus] 


HG 1 000799N0_20000_gene_prediction 


gi|20847912|ref|XP_144610.1| similar to
KIAA1904 protein [Homo sapiens] [Mus 
musculus] 


HG1000808N0 160000 _gene_predictio 
nl 


gi|26345960|dbj|BAC36631.1| unnamed protein
product [Mus musculus] 


HG1000817N0_160000 _gene_predictio 
nl 


gi|20882231|ref|XP_139203.1| similar to
KIAA0858 protein [Homo sapiens] [Mus 
musculus] 


HG1000822N0_20000 _gene_prediction 
1 


gi|13242237|ref|NP_077327.1| Heat shock
cognate protein 70; heat shock 70kD protein 8
[Rattus norvegicus] 


HG1000824N0_1 60000 _gene_predictio 
nl 


gi|6680195|ref|NP_032255.1| histone
deacetylase 2; DNA segment, Chr 10, Wayne
State University 179, expressed [Mus 
musculus] 


HGl 000824N0_1 0000_gene_prediction 
1 


gi|20883564|ref|XP_152815.1| hypothetical
protein XP_152815 [Mus musculus]


HG1000839N0_160000_gene_predictio 
nl 


gi|20883564|ref|XP_152815.1| hypothetical
protein XP_152815 [Mus musculus]
HG1000842N0 160000_gene_predictio
nl 


gi|26339496|dbj|BAC33419.1| unnamed protein 
product [Mus musculus] 


HG1000842N0 160000_gene_predictio 
n2 


gi|3599320|gb|AAC72793.1| ORF2 [Mus 
musculus domesticus] 


HG1000869N0 160000_gene_predictio 
nl 


gi|6715564|ref|NP_032607.1| melanoma
antigen, 80 kDa [Mus musculus] 


HG1000870N0 160000_gene_predictio 
nl 


gi|20881174|ref|XP_147875.1| hypothetical
protein XP_147875 [Mus musculus]


HG1000870N0 160000 gene_predictio 
n2


gi|27369942|ref|NP_766246.1| hypothetical
protein 9530051F04 [Mus musculus]


HG1000878N0 20000 gene_prediction 
1 


gi|27369942|ref|NP_766246.1| hypothetical
protein 9530051F04 [Mus musculus]


HG1000878N0 20000_gene_prediction 
2 


gi|27369942|ref|NP_766246.1| hypothetical
protein 9530051F04 [Mus musculus]


HG1000904N0 160000_gene_predictio 
n2 


gi|27369942|ref|NP_766246.1| hypothetical
protein 9530051F04 [Mus musculus]


HG1000904N0 40000^ene_prediction 


gi|3599320|gb|AAC72793.1| ORF2 [Mus
musculus domesticus] 


HGl 000906N0_5000_gene_predictionl 


gi|3599320|gb|AAC72793.1| ORF2 [Mus
musculus domesticus] 


HG1000906N0 160000_gene_predictio 
n2 


gi|20836822|ref|XP_130277.1| similar to
Plakophilin 4 (p0071) [Mus musculus] 


HG1000910N0 160000_gene_predictio 
nl 


gi|3599320|gb|AAC72793.1| ORF2 [Mus
musculus domesticus] 


HG1000948N0 160000_gene_predictio 
nl 


gi|26325846|dbj|BAC26677.1| unnamed protein
product [Mus musculus] 


HG1000955N0 160000 gene_predictio 
nl 


gi|3599320|gb|AAC72793.1| ORF2 [Mus
musculus domesticus] 


HG1000959N0 160000_gene_predictio 
nl 


gi|7670427|dbj|BAA95065.1| unnamed protein 
product [Mus musculus] 


HG1000959N0_5000_gene_predictionl 


gi|22507385|ref|NP_081019.1| RIKEN cDNA
1110014F12 [Mus musculus]


HG 1 000990N0_5000_gene_predictionl 


gi|22507385|ref|NP_081019.1| RIKEN cDNA
1110014F12 [Mus musculus]


HG1000994N0 10000 gene_prediction 
1 


gi|10946762|ref|NP_067382.1| triggering
receptor expressed on myeloid cells 3; 
triggering receptor expressed on monocytes 3 
[Mus musculus] 


HG1000994N0_160000 _gene_predictio 
n2 


gi|12855175|dbj|BAB30238.1| unnamed protein
product [Mus musculus] 


HGl 000994N0_1 0000_gene_prediction 
2 


gi|12855175|dbj|BAB30238.1| unnamed protein
product [Mus musculus] 


HGIOOIOOINO 160000_gene_predictio 
nl 


gi|12855175|dbj|BAB30238.1| unnamed protein
product [Mus musculus] 
HG1001001N0_0_gene_prediction1


gi|26337385|dbj|BAC32378.1| unnamed protein 
product [Mus musculus]


HG1001002N0 160000 gene_predictio 
nl 


gi|27370034|ref|NP_766297.1| hypothetical
protein A530025J20 [Mus musculus]


HG1001003N0 0 gene_predictionl 


gi|20348159|ref|XP_111588.1| similar to
TRAV9D-3 [Mus musculus] 


HG1001007N0 160000 gene_predictio 
n2 


gi|27370034|ref|NP_766297.1| hypothetical
protein A530025J20 [Mus musculus]


HGl 00 1 0 1 1N0_1 60000_gene_predictio 


gi|13097000|gb|AAH03291.1| Similar to
hypothetical protein FLJ10342 [Mus musculus]


HG1001011N0_160000_gene_predictio 


gi|26336525|dbj|BAC31945.1| unnamed protein
product [Mus musculus] 


nl 


gi|25047957|ref|XP_130582.2| similar to
hypothetical protein MGC14161 [Homo 
sapiens] [Mus musculus] 


HG 1 00 1 0 1 4N0_5000_gene_predictionl 


gi|26337385|dbj|BAC32378.1| unnamed protein
product [Mus musculus] 


XlVJlUUlUl /INU 10UUUU__gCIXC_piCUJ.t'llU 

nl 


gi|26337385|dbj|BAC32378.1| unnamed protein
product [Mus musculus] 


JtiO 1 UU 1 UZUiNU 1 OUUUU_^CllC_|JICUlt'U.U 

nl 


gi|25019831|ref|XP_207463.1| similar to
CD59B [Mus musculus] 


Jn.VJlUUlU/'H'iNv/ lOWUWU ^CIIC piCUiCllw 

nl 


gi|26338976|dbj|BAC33159.1| unnamed protein
product [Mus musculus] 


HG1001024N0 160000 gene_predictio 
n2 


gi|20915148|ref|XP_149841.1| hypothetical
protein XP_149841 [Mus musculus]


HG1001031N0 160000 gene_predictio 
nl 


gi|20915148|ref|XP_149841.1| hypothetical
protein XP_149841 [Mus musculus]


HG1001035N0_5000_gene_predictionl 


gi|25071690|ref|XP_193591.1| hypothetical
protein XP_193591 [Mus musculus]


HG1001043N0_160000 _gene_predictio 
nl 


gi|26347249|dbj|BAC37273.1| unnamed protein
product [Mus musculus] 


HG1001046N0 5000 gene_predictionl 


gi|6678714|ref|NP_032537.1| lymphoid-
restricted membrane protein [Mus musculus]


nl 


gi|25048969|ref|XP_143803.3| similar to
bA401.1 (novel protein) [Homo sapiens] [Mus 
musculus] 


HG1001047N0 1000 gene_predictionl 


gi|25021180|ref|XP_207917.1| similar to RNP
particle component [Mus musculus] 


HG1001048N0 160000_gene_predictio 
nl 


gi|26353724|dbj|BAC40492.1| unnamed protein
product [Mus musculus] 


HGl 001 048N0_1 60000_gene_predictio 
n2 


gi|20343845|ref|XP_109652.1| similar to
hypothetical protein FLJ25217 [Homo sapiens]
[Mus musculus] 


HG1001144N0 20000 gene prediction 


1


gi|20346197|ref|XP_110161.1| RAN binding
protein 1 [Mus musculus]


n/^l nm 1 /IQTVTn 1^nflf»f» noma n-n^Air^n 

jtivj 1 uu 1 iH-oiNU louuuu gcnc_prctui/Lio 
n2 


gi|3599320|gb|AAC72793.1| 0RF2 [Mus 
musculus domesticus] 


Jtivjriuui 1 //iNU louuuu genc_prctiii/Lio 
nl 


gi|26339628|dbj|BAC33485.1| unnamed protein 
product [Mus musculus] 


HG1001172N0_20000_gene_prediction 


gi|22122489|ref|NP_666128.1| hypothetical
protein MGC38936 [Mus musculus] 


HGl 00 1 1 87N0_1 60000_gene_predictio 
nl 


gi|26340706|dbj|BAC34015.1| unnamed protein 
product [Mus musculus]


noiUUiiy/JNU louuuu gene preciiciio 
nl 


gi|18497290|ref|NP_084056.1| protein kinase
raf 1; murine sarcoma 3611 oncogene 1;
sarcoma 3611 oncogene [Mus musculus]


rivjiuuiiy4JNU louuuu _gene_preciiciio 
nl 


musculus domesticus] 


tiyj LXjyjiL yy JNU 1 oUUUU gene preaicuo 
nl 


protein XP_1 32241 [Mus musculus] 


jtujriuui lyyiNU louuuu gene preuictiu 
n2 


gi|20071068|gb|AAH27341.1| Similar to 
elongation factor G2 [Mus musculus] 


xTvjiuui//UiNU lOUUUU gene preuicuo 
nl 


gi|20071068|gb|AAH27341.1| Similar to 
elongation factor G2 [Mus musculus] 


riCj 1 UU 1 /zi JNU loUUUU^ene_preQiciio 
nl 


o-il'^nQflST'^^lrpflyP TJ^^QR 11 Qi'milnr tn lipliY- 
gi zuyuo / 0 j|rci|y\x i/iz.jyo.i| oiiiiiiai lu iiciia- 

destabilizing protein - rat [Mus musculus] 


trniAAITJOXTA l/^AAAA rra-no -iiw/li/vh'rt 

rlvjluui/zyiNU louuuu gene prcuictio 
n2 


gi|25024769|ref|XP_207136.1| similar to ORF2
[Mus musculus domesticus] 


HGl 001 230N0_5000_gene_prediction 1 


gi|6754206|ref|NP_034568.1| hexokinase 1;
downeast anemia [Mus musculus]


jtHjriuui/jjiNU louuuu gene preciiciio 
nl 


gi|12857205|dbj|BAB30930.1| unnamed protein
product [Mus musculus] 


xHjtIUUI/j jJNU wwu gene_preciicuon 
1 


gi|21703918|ref|NP_663438.1| hypothetical
protein BC024118 [Mus musculus]


TJT/T. 1 AA 1 T2 ^XTA OAAAA rr^i^A 1-^■ro^^1r^f1/-»n 

rlLrlUUl/i*)JNU zUUUU_^eiie_preoiciion 

1 


gi|26339338|dbj|BAC33340.1| unnamed protein
product [Mus musculus] 


jtioiuul/jjiNU louuuu gene preuit-uu 
n2 


gi|26339338|dbj|BAC33340.1| unnamed protein
product [Mus musculus] 


WmnniO'^^lsJn l^nOOO o-p-nf* nrfvlirfio 
XivJ 1 1 vyjyjKjyj gciic_pi cuiC' uu 

n3 


gi|26340904|dbj|BAC34114.1| unnamed protein
product [Mus musculus] 


rnjriuui/iOUiNU louuuu _gcnc_prcuioinj 
nl 


gi|26327795|dbj|BAC27638.1| unnamed protein
product [Mus musculus] 


HG 1 00 1 260N0_40000_gene_prediction 
1 


gi|8922328|ref|NP_060517.1| hypothetical
protein FLJ10290 [Homo sapiens]


HG1001264N0 160000 gene_predictio 
nl 


gi|8922328|ref|NP_060517.1| hypothetical
protein FLJ10290 [Homo sapiens] 


HG1001274N0 160000 gene predictio 


nl


gi|26383198|dbj|BAC25520.1| unnamed protein
product [Mus musculus]


HG1001284N0 160000 gene_predictio 
nl 


gi|3599320|gb|AAC72793.1| ORF2 [Mus
musculus domesticus] 


HGl 001 284N0_1 60000_gene_predictio 


gi|26326843|dbj|BAC27165.1| unnamed protein 
product [Mus musculus]


HGl 00 1 292N0_1 60000_gene_predictio 


gi|26326843|dbj|BAC27165.1| unnamed protein
product [Mus musculus]


HGl 00 1 3 02N0_1 60000_gene_predictio 


gi|13097342|gb|AAH03421.1| Similar to
ATPase, H+ transporting, lysosomal (vacuolar 
proton pump) 31kD [Mus musculus] 


HGl 001 3 1 3N0_1 60000_gene_predictio 


gi|12852631|dbj|BAB29486.1| unnamed protein
product [Mus musculus] 


nl 


gi|25053141|reflXP_193739.1| similar to 
betaine-homocysteine methyltransferase 
[Rattus norvegicus] [Mus musculus] 


HGl 00 1 328N0_5000_gene_predictionl 


gi|26347687|dbj|BAC37492.1| unnamed protein 
product [Mus musculus] 


HG1001328N0 40000_gene_prediction 
1 


gi|26352918|dbj|BAC40089.1| unnamed protein

product [Mus musculus] 


HGl 00 1 33 lNO_0_genejpredictionl 


gi|3599320|gb|AAC72793.1| ORF2 [Mus
musculus domesticus] 


nl 


gi|20381292|gb|AAH27770.1| stromal cell 
derived factor receptor 2 [Mus musculus] 


HG1001335N0 160000 gene_predictio 
n2 


gi|2193870|dbj|BAA20419.1| reverse 
transcriptase [Mus musculus] 


HG1001348N0 160000 gene_predictio 
nl 


gi|2193870|dbj|BAA20419.1| reverse 
transcriptase [Mus musculus] 


HGl 00 1 349N0_1 60000_gene_pre(iictio 
nl 


gi|20846538|ref|XP_150033.1| hypothetical
protein XP_150033 [Mus musculus]


HGl 001 354N0_1 60000_gene_predictio 


gi|7305215|reflNP_038599.1| kinase suppressor 
of ras [Mus musculus] 


HG1001361N0 160000 gene_predictio 
nl 


gi|6678690|reflNP_032525.1| LIM homeobox 
protein 5; LIM homeo box protein 5 [Mus 
musculus] 


nl 


gi|20345901|ref|XP_109824.1| hypothetical
protein XP_109824 [Mus musculus]


HGl 00 1 3 76N0_5000_gene_predictionl 


gi|27261816|ref|NP_080861.1| RIKEN cDNA
C530005J20 [Mus musculus] 


HGl 001376N0_20000_^ene_prediction 


gi|27261816|ref|NP_080861.1| RIKEN cDNA
C530005J20 [Mus musculus] 


HG 1 00 1 3 76N0_5000_gene_piediction2 


gi|27261816|ref|NP_080861.1| RIKEN cDNA
C530005J20 [Mus musculus] 


HG1001376N0_5000 _5ene_prediction3 


gi|27261816|ref|NP_080861.1| RIKEN cDNA
C530005J20 [Mus musculus] 
HG1001417N0 160000 gene_predictio
nl


gi|27261816|ref|NP_080861.1| RIKEN cDNA
C530005J20 [Mus musculus] 


HG 1 00141 7N0_1 000_gene_predictionl 


gi|26349767|dbj|BAC38523.1| unnamed protein
product [Mus musculus] 


HGl 00141 7N0 160000_gene_predictio 
n2 


gi|26349767|dbj|BAC38523.1| unnamed protein 
product [Mus musculus] 


HG1001417N0 160000_gene_predictio 

n3


gi|26349767|dbj|BAC38523.1| unnamed protein 
product [Mus musculus] 


HGl 001436N0^500b_gene_predictionl 


gi|26349767|dbj|BAC38523.1| unnamed protein 
product [Mus musculus] 


HG1001436N0 20000 gene_prediction 


gi|20987280|gb|AAH29643.1| Unknown 
(protein for MGC:25768) [Mus musculus]


HG1001436N0 160000_gene_predictio 
nl 


gi|25051637|ref|XP_194491.1| RIKEN cDNA
1110053F02 [Mus musculus]


HG1001439N0 160000 gene predictio 
nl 


gi|25051637|ref|XP_194491.1| RIKEN cDNA
1110053F02 [Mus musculus]


HG1001484N0 160000 gene _predictio 
nl 


gi|6753290|ref|NP_033943.1| calsequestrin 1
[Mus musculus] 


HG1001485N0 10000 gene _prediction 
1 


gi|25029827|ref|XP_207226.1| similar to ORF2
[Mus musculus domesticus] 


HG1001500N0 160000^ene_predictio 
nl 


gi|3599320|gb|AAC72793.1| ORF2 [Mus
musculus domesticus]


HG1001500N0 160000_gene_predictio 
n2 


gi|6679108|ref|NP_032748.1| nucleophosmin 1;
nucleolar protein NO38 [Mus musculus]


HG1001508N0 160000_gene_predictio 
nl 


gi|2DU2yy2o|rei|Ar_2U /Zj/.1\ smular to 
Retrovirus-related POL polyprotein [Mus 
musculus] 




gi|20340683|ref|XP_110361.1| similar to
phospholipase C beta 2 [Rattus norvegicus] 
[Mus musculus] 
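
The top hit annotations listed above follow the NCBI-style identifier layout gi|number|database|accession|, followed by a free-text description and, in most entries, a bracketed organism name. Purely as an illustration, the following Python sketch splits one such entry into its fields. The names TopHit and parse_top_hit are hypothetical and form no part of the disclosure, and the sketch assumes that a wrapped entry has first been joined onto a single line.

```python
import re
from typing import NamedTuple, Optional


class TopHit(NamedTuple):
    """One parsed annotation entry (hypothetical container, for illustration)."""
    gi: str
    database: str
    accession: str
    description: str
    organism: Optional[str]


# gi|<number>|<db>|<accession>|<optional entry name> <description>
# The entry name after the last bar is empty for ref/dbj/gb records and is
# present for Swiss-Prot (sp) records, e.g. BR3B_MOUSE.
_DEFLINE = re.compile(
    r"^gi\|(?P<gi>\d+)\|(?P<db>[a-z]+)\|(?P<acc>[^|\s]+)\|\S*\s*(?P<rest>.*)$"
)

# A trailing "[Organism name]" at the very end of the description.
_ORGANISM = re.compile(r"^(?P<desc>.*?)\s*\[(?P<org>[^\[\]]+)\]\s*$")


def parse_top_hit(line: str) -> Optional[TopHit]:
    """Parse one annotation entry; return None for entries such as 'no blast hit'."""
    match = _DEFLINE.match(" ".join(line.split()))
    if match is None:
        return None
    rest = match.group("rest")
    organism_match = _ORGANISM.match(rest)
    if organism_match:
        description = organism_match.group("desc")
        organism = organism_match.group("org")
    else:
        description, organism = rest, None
    return TopHit(
        match.group("gi"),
        match.group("db"),
        match.group("acc"),
        description,
        organism,
    )


if __name__ == "__main__":
    print(parse_top_hit("gi|3599320|gb|AAC72793.1| ORF2 [Mus musculus domesticus]"))
```

Entries whose identifiers are illegible fail the pattern and are returned as None, so they can be set aside for manual review rather than silently misread.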
Examples

[0602] The examples, which are intended to be purely exemplary of the 
invention and should therefore not be considered to limit the invention in any way, 
also describe and detail aspects and embodiments of the invention discussed above. 
The examples are not intended to represent that the experiments below are all or the 
only experiments performed. Efforts have been made to ensure accuracy with respect 
to numbers used (e.g., amounts, temperature, etc.) but some experimental errors and 
deviations should be accounted for. Unless indicated otherwise, parts are parts by 
weight, molecular weight is weight average molecular weight, temperature is in 
degrees Centigrade, and pressure is at or near atmospheric. 

[0603] While the present invention has been described with reference to the 
specific embodiments thereof, it should be understood by those skilled in the art that 
various changes may be made and equivalents may be substituted without departing 
from the true spirit and scope of the invention. In addition, many modifications can 
be made to adapt a particular situation, material, composition of matter, process, 
process step or steps, to the objective, spirit and scope of the present invention. All 
such modifications are intended to be within the scope of the claims appended hereto. 

[0604] Additional objects and advantages of the invention will be set forth in 
part in the description which follows, and in part will be obvious from the description,
or may be learned by practice of the invention. The objects and advantages of the 
invention will be realized and attained by means of the elements and combinations 
particularly pointed out in the appended claims. Moreover, advantages described in 
the body of the specification, if not included in the claims, are not per se limitations to 
the claimed invention. 

[0605] It is to be understood that both the foregoing general description and 
the following detailed description are exemplary and explanatory only and are not 
restrictive of the invention, as claimed. Moreover, it must be understood that the 
invention is not limited to the particular embodiments described, as such may, of
course, vary. Further, the terminology used to describe particular embodiments is not 
intended to be limiting, since the scope of the present invention will be limited only 
by its claims. 

[0606] With respect to ranges of values, the invention encompasses each 
intervening value between the upper and lower limits of the range to at least a tenth of 
the lower limit's unit, unless the context clearly indicates otherwise. Further, the
invention encompasses any other stated intervening values. Moreover, the invention
also encompasses ranges excluding either or both of the upper and lower limits of the
range, unless specifically excluded from the stated range.

[0607] Unless defined otherwise, the meanings of all technical and scientific 
terms used herein are those commonly understood by one of ordinary skill in the art to 
which this invention belongs. One of ordinary skill in the art will also appreciate that 
any methods and materials similar or equivalent to those described herein can also be
used to practice or test the invention. Further, all publications mentioned herein are 
incorporated by reference. 

[0608] It must be noted that, as used herein and in the appended claims, the
singular forms "a," "or," and "the" include plural referents unless the context clearly 
dictates otherwise. Thus, for example, reference to "a subject polypeptide" includes a 
plurality of such polypeptides and reference to "the agent" includes reference to one 
or more agents and equivalents thereof known to those skilled in the art, and so forth. 

[0609] Further, all numbers expressing quantities of ingredients, reaction 
conditions, % purity, polypeptide and polynucleotide lengths, and so forth, used in the 
specification and claims, are modified by the term "about," unless otherwise 
indicated. Accordingly, the numerical parameters set forth in the specification and 
claims are approximations that may vary depending upon the desired properties of the
present invention. At the very least, and not as an attempt to limit the application of 
the doctrine of equivalents to the scope of the claims, each numerical parameter 
should at least be construed in light of the number of reported significant digits, 
applying ordinary rounding techniques. Nonetheless, the numerical values set forth in 
the specific examples are reported as precisely as possible. Any numerical value, 
however, inherently contains certain errors from the standard deviation of its
experimental measurement. 

[0610] The publications discussed herein are provided solely for their
disclosure prior to the filing date of the present application. Nothing herein is to be
construed as an admission that the present invention is not entitled to antedate such
publication by virtue of prior invention. Further, the dates of publication provided
may be different from the actual publication dates which may need to be
independently confirmed. 
Example 1: Expression in E. coli

[0611] Sequences can be expressed in E. coli. Any one or more of the
sequences according to SEQ ID NOS.: 1-104 can be expressed in E. coli by
subcloning the entire coding region, or a selected portion thereof, into a prokaryotic
expression vector. For example, the expression vector pQE16 from the QIA
expression prokaryotic protein expression system (Qiagen, Valencia, CA) can be
used. The features of this vector that make it useful for protein expression include an
efficient promoter (phage T5) to drive transcription, expression control provided by
the lac operator system, which can be induced by addition of IPTG (isopropyl-beta-D-
thiogalactopyranoside), and an encoded 6XHis tag coding sequence. The latter encodes a
stretch of six histidine amino acid residues which can bind very tightly to a nickel
atom. This vector can be used to express a recombinant protein with a 6XHis tag
fused to its carboxyl terminus, allowing rapid and efficient purification using Ni-
coupled affinity columns.
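
Because the 6XHis tag described above is fused to the carboxyl terminus, the amplified coding region generally needs to be a whole number of codons and free of stop codons, so that translation can read through into the tag. The following Python sketch illustrates such a check. The function name check_insert_for_c_terminal_tag is hypothetical, and the sketch assumes the insert begins at the first base of a codon; it does not model the cloning sites or the reading frame of pQE16 itself.

```python
from typing import List

# Stop codons of the standard genetic code (DNA alphabet).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def check_insert_for_c_terminal_tag(coding_region: str) -> List[str]:
    """Return the problems that would prevent an in-frame C-terminal fusion.

    An empty list means none of the checked problems was found.
    """
    seq = coding_region.upper()
    problems = []

    if set(seq) - set("ACGT"):
        problems.append("sequence contains characters other than A, C, G, T")

    if len(seq) % 3 != 0:
        problems.append(
            "length is not a multiple of 3, so the tag would be out of frame"
        )

    # Walk complete codons only; any trailing partial codon is ignored here
    # because it has already been reported by the length check.
    complete_length = len(seq) - len(seq) % 3
    for start in range(0, complete_length, 3):
        codon = seq[start:start + 3]
        if codon in STOP_CODONS:
            problems.append(
                f"stop codon {codon} at codon {start // 3 + 1} would end "
                "translation before the tag"
            )
            break

    return problems


if __name__ == "__main__":
    # Hypothetical insert ending in a stop codon that should be removed.
    for problem in check_insert_for_c_terminal_tag("ATGGCTAAAGGTTAA"):
        print(problem)
```

In practice the native stop codon is typically omitted by primer design, which is the situation this check is meant to confirm.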

[0612] The entire or the selected partial coding region can be amplified by 
PCR, then ligated into digested pQE16 vector. The ligation product can be 
transformed by electroporation into electrocompetent E. coli cells (for example, strain 
M15[pREP4] from Qiagen), and the transformed cells may be plated on 
ampicillin-containing plates. Colonies may then be screened for the correct insert in the proper 
orientation using a PCR reaction employing a gene-specific primer and a 
vector-specific primer. Also, positive clones can be sequenced to ensure correct orientation 
and sequence. To express the proteins, a colony containing a correct recombinant 
clone can be inoculated into L-Broth containing 100 µg/ml of ampicillin and 25 
µg/ml of kanamycin, and the culture allowed to grow overnight at 37 degrees C. The 
saturated culture may then be diluted 20-fold in the same medium and allowed to 
grow to an optical density of 0.5 at 600 nm. At this point, IPTG can be added to a 
final concentration of 1 mM to induce protein expression. After growing the culture 
for an additional 5 hours, the cells may be harvested by centrifugation at 3000 times g 
for 15 minutes. 
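
The 20-fold dilution and the 1 mM IPTG induction described above reduce to simple volume arithmetic. The following is a minimal sketch only; the function names, the 100 ml culture volume, and the 1 M IPTG stock are illustrative assumptions and are not values taken from the procedure.

```python
def split_for_fold_dilution(final_volume_ml, fold):
    """Return (inoculum_ml, fresh_medium_ml) for a given fold dilution."""
    inoculum = final_volume_ml / fold
    return inoculum, final_volume_ml - inoculum


def stock_to_add_ml(culture_volume_ml, final_mM, stock_mM):
    """Volume of stock that brings the culture to final_mM.

    Solves stock_mM * v = final_mM * (culture_volume_ml + v), so the
    volume contributed by the stock itself is taken into account.
    """
    if stock_mM <= final_mM:
        raise ValueError("stock must be more concentrated than the target")
    return final_mM * culture_volume_ml / (stock_mM - final_mM)


if __name__ == "__main__":
    # Assumed example values: 100 ml induced culture, 1 M (1000 mM) IPTG stock.
    inoculum, medium = split_for_fold_dilution(100.0, 20)
    iptg_ml = stock_to_add_ml(100.0, final_mM=1.0, stock_mM=1000.0)
    print(f"{inoculum:.1f} ml overnight culture + {medium:.1f} ml fresh medium")
    print(f"add {iptg_ml * 1000:.0f} microliters of IPTG stock")
```

For other culture volumes or stock concentrations, only the arguments change; the relationships themselves follow directly from the dilution factor and final concentration stated in the text.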

[0613] The resultant pellet can be lysed with a mild, nonionic detergent in 20 
mM Tris HCl (pH 7.5) (B-PER™ Reagent from Pierce, Rockford, IL), or by 
sonication until the turbid cell suspension turns translucent. The resulting lysate can 
be further purified using a nickel-containing column (Ni-NTA spin column from 
Qiagen) under non-denaturing conditions. Briefly, the lysate will be adjusted to 300 
mM NaCl and 10 mM imidazole, then centrifuged at 700 times g through the nickel 
spin column to allow the His-tagged recombinant protein to bind to the column. The 
column will be washed twice with wash buffer (for example, 50 mM NaH2PO4, pH 
8.0; 300 mM NaCl; 20 mM imidazole) and eluted with elution buffer (for example, 50 
mM NaH2PO4, pH 8.0; 300 mM NaCl; 250 mM imidazole). All the above 
procedures will be performed at 4 degrees C. The presence of a purified protein of 
the predicted size can be confirmed with SDS-PAGE. 

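
Adjusting the lysate to 300 mM NaCl and 10 mM imidazole at the same time means each added stock also dilutes the other. The sketch below shows one way to work out the volumes; it assumes, for illustration only, that the lysate initially contains neither solute and that 5 M NaCl and 1 M imidazole stocks are used. The function name and stock concentrations are hypothetical.

```python
def adjust_lysate(lysate_ml, targets_mM, stocks_mM):
    """Volumes (ml) of each stock needed to reach all targets at once.

    Each stock makes up the fraction target/stock of the final volume,
    so final_volume = lysate_ml / (1 - sum of those fractions).
    Assumes the lysate starts with none of the listed solutes.
    """
    fractions = {name: targets_mM[name] / stocks_mM[name] for name in targets_mM}
    total_fraction = sum(fractions.values())
    if total_fraction >= 1:
        raise ValueError("stocks are too dilute to reach the targets")
    final_volume = lysate_ml / (1 - total_fraction)
    return {name: frac * final_volume for name, frac in fractions.items()}


if __name__ == "__main__":
    # Assumed stocks: 5 M NaCl and 1 M imidazole; assumed 0.93 ml of lysate.
    volumes = adjust_lysate(
        0.93,
        targets_mM={"NaCl": 300, "imidazole": 10},
        stocks_mM={"NaCl": 5000, "imidazole": 1000},
    )
    for name, ml in volumes.items():
        print(f"{name}: add {ml * 1000:.0f} microliters")
```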
Example 2: Expression in Mammalian Cells 

[0614] The sequences encoding the proteins of Example 1 can be cloned into 
the pENTR vector (Invitrogen) by PCR and transferred to the mammalian expression 
vector pDEST12.2 per manufacturer's instructions (Invitrogen). Introduction of the 
recombinant construct into the host cell can be effected by transfection with Fugene 6 
(Roche) per manufacturer's instructions. The host cells containing one of the 
polynucleotides of the invention can be used in conventional manners to produce the 
gene product encoded by the isolated fragment (in the case of an ORF). A number of 
types of cells can act as suitable host cells for expression of the proteins. Mammalian 
host cells include, for example, monkey COS cells, Chinese Hamster Ovary (CHO) 
cells, human kidney 293 cells, human epidermal A431 cells, human Colo205 cells, 
3T3 cells, CV-1 cells, other transformed primate cell lines, normal diploid cells, cell 
strains derived from in vitro culture of primary tissue, primary explants, HeLa cells, 
mouse L cells, BHK, HL-60, U937, HaK or Jurkat cells. 

Example 3: Expression in Cell-Free Translation Systems 

[0615] Cell-free translation systems can also be employed to produce 
proteins using RNAs derived from the DNA constructs of the present invention. 
Appropriate cloning and expression vectors containing SP6 or T7 promoters for use 
with prokaryotic and eukaryotic hosts have been described (Sambrook et al., 1989). 
These DNA constructs can be used to produce proteins in a rabbit reticulocyte lysate 
system or in a wheat germ extract system. 

[0616] Specific expression systems of interest include plant, bacterial, yeast, 
insect cell and mammalian cell derived expression systems. Expression systems in 
plants include those described in U.S. Patent No. 6,096,546 and U.S. Patent No. 
6,127,145. Expression systems in bacteria include those described by Chang et al., 
1978, Goeddel et al., 1979, Goeddel et al., 1980, EP 0 036,776, U.S. Patent No. 
4,551,433, DeBoer et al., 1983, and Siebenlist et al., 1980. 

[0617] Mammalian expression is further accomplished as described in 
Dijkema et al., 1985, Gorman et al., 1982, Boshart et al., 1985, and U.S. Patent No. 
4,399,216. Other features of mammalian expression are facilitated as described in 
Ham and Wallace, Meth. Enz., 1979, Barnes and Sato, 1980, U.S. Patent Nos. 
4,767,704, 4,657,866, 4,927,762, 4,560,655, WO 90/103430, WO 87/00195, and U.S. 
RE 30,985. 

Example 4: Expression of the Secreted Factors in Yeast 

[0618] The secreted factors can be amplified by PCR using specifically designed 
primers and cloned into pENTR/D-TOPO vectors (Invitrogen, Carlsbad, CA). The secreted 
factors in pENTR/D-TOPO can be cloned into the yeast expression vector 
pYES-DEST52 by Gateway LR reaction (Invitrogen, Carlsbad, CA). The resulting yeast 
expression vectors can be transformed into the INVSc1 strain from Invitrogen to express 
the secreted factors according to the manufacturer's protocol (Invitrogen, Carlsbad, 
CA). The expressed secreted factors will have a 6XHis tag at the C-terminus. 
Expressed protein can be purified with ProBond™ resin (Invitrogen, Carlsbad, CA). 

[0619] Expression systems in yeast include those described in Hinnen et al., 
1978, Ito et al., 1983, Kurtz et al., 1986, Kunze et al., 1985, Gleeson et al., 1986, 
Roggenkamp et al., 1986, Das et al., 1984, De Louvencourt et al., 1983, Van den Berg 
et al., 1990, Kunze et al., 1985, Cregg et al., 1985, U.S. Patent No. 4,837,148, U.S. 
Patent No. 4,929,555, Beach and Nurse, 1981, Davidow et al., 1985, Gaillardin et al., 
1985, Ballance et al., 1983, Tilburn et al., 1983, Yelton et al., 1984, Kelly and Hynes, 
1985, EP 0 244,234, and WO 91/00357. 

Example 5: Expression of Secreted Factors in Baculovirus Expression System 

[0620] The secreted factors in pENTR/D-TOPO can be cloned into 
Baculovirus expression vector pDEST10 by Gateway LR reaction (Invitrogen, 
Carlsbad, CA). The secreted factors can be expressed by the Bac-to-Bac expression 
system from Invitrogen (Carlsbad, CA), briefly described as follows. The expression 
vectors containing the secreted factors are transformed into competent DH10Bac™ E. 
coli strain and selected for transposition. The resulting E. coli contain recombinant 
bacmid that contains the secreted factor. High molecular weight DNA can be isolated 
from the E. coli containing the recombinant bacmid and then transfected into insect 
cells with Cellfectin reagent. The expressed secreted factors will have a 6XHis tag at 
the N-terminus. Expressed protein will be purified by ProBond™ resin (Invitrogen, 
Carlsbad, CA). 

[0621] Expression of heterologous genes in insects can be accomplished as 
described in U.S. Patent No. 4,745,051; Doerfler et al., 1987; Friesen et al., 1986; EP 
0 127,839, EP 0 155,476, Vlak et al., 1988, Miller et al., 1988, Carbonell et al., 1988, 
Maeda et al., 1985, Lebacq-Verheyden et al., 1988, Smith et al., 1985, Miyajima et 
al.; and Martin et al., 1988. Numerous baculoviral strains and variants and 
corresponding permissive insect host cells from hosts have been previously described 
(Setlow et al., 1986, Luckow et al., 1988; Miller et al., 1986; Maeda et al., 1985). 

Example 6: Primer Design 

[0622] To design the forward primer for PCR amplification, the melting 
point of the first 20 to 24 bases of the primer can be calculated by counting total A 
and T residues, then multiplying by 2. To design the reverse primer for PCR 
amplification, the melting point of the first 20 to 24 bases of the reverse complement, 
with the sequences written from 5-prime to 3-prime, can be calculated by counting the 
total G and C residues, then multiplying by 4. Both start and stop codons can be 
present in the final amplified clone. The length of the primers is chosen to obtain 
melting temperatures within 63 degrees C to 68 degrees C. Adding the bases "CACC" 
to the forward primer renders it compatible for cloning the PCR product with the 
TOPO pENTR/D (Invitrogen, CA). 
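
The primer design rule above can be sketched in code. This sketch assumes that the A/T and G/C contributions are summed, as in the commonly used Wallace rule (Tm = 2 x (A+T) + 4 x (G+C)); the text states the two terms separately, so the combination is an interpretation. The function names, the example template, and the use of "CACC" as the directional TOPO prefix are illustrative assumptions.

```python
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def wallace_tm(primer):
    """Estimated melting temperature: 2 per A/T plus 4 per G/C (degrees C)."""
    s = primer.upper()
    at = s.count("A") + s.count("T")
    gc = s.count("G") + s.count("C")
    return 2 * at + 4 * gc


def reverse_complement(seq):
    """Reverse complement, written 5-prime to 3-prime."""
    return seq.translate(COMPLEMENT)[::-1]


def pick_primer(template, tm_low=63, tm_high=68, min_len=20, max_len=24):
    """Shortest 5-prime prefix of 20 to 24 bases whose Tm is in range, else None."""
    for length in range(min_len, max_len + 1):
        if length > len(template):
            break
        candidate = template[:length]
        if tm_low <= wallace_tm(candidate) <= tm_high:
            return candidate
    return None


def design_primer_pair(orf, topo_prefix="CACC"):
    """Forward primer (with assumed TOPO prefix) and reverse primer for an ORF."""
    forward = pick_primer(orf)
    reverse = pick_primer(reverse_complement(orf))
    if forward is None or reverse is None:
        return None
    return topo_prefix + forward, reverse


if __name__ == "__main__":
    # Hypothetical 30-base template used only to exercise the functions.
    template = "ATGGCTGCAGGTACCGATCTGAAGCTTGCA"
    primer = pick_primer(template)
    print(primer, wallace_tm(primer))
```

The melting temperature is computed on the gene-specific bases only; the prefix is appended afterwards, matching the description of adding four bases to the forward primer.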

Example 7: Reverse Transcriptase Reaction 

[0623] cDNA can be prepared by the following method. Between 200 ng 
and 1.0 µg mRNA is added to 2 µl DMSO and the volume adjusted to 11 µl with 
DEPC-treated water. One µl Oligo dT is added to the tube, and the mixture is heated 
at 70° C for 5 min., quickly chilled on ice for 2 min., and the mixture is collected at 
the bottom of the tube by brief centrifugation. The following 1st strand components 
are then added to the mRNA mixture: 2 µl 10X Stratascript (Stratagene, CA) 1st strand 
buffer, 1 µl 0.1 M DTT, 1 µl 10 mM dNTP mix (10 mM each of dG, dA, dT and 
dCTP), 1 µl RNAse inhibitor, 3 µl Stratascript RT (50 U/µl). The contents are gently 
mixed and the mixture collected by brief centrifugation. The mixture is incubated in a 
42° C water bath for 1 hour, placed in a 70° C water bath for 15 min. to stop the 
reaction, transferred to ice for 2 min., and centrifuged briefly in a microfuge to collect 
the reaction product at the bottom of the reaction vessel. Two µl RNAse H is then 
added to the tube, the contents are mixed well, incubated at 37° C in a water bath for 
20 min., and centrifuged briefly in a microfuge to collect the reaction product at the 
bottom of the reaction vessel. The reaction mixture can proceed directly to PCR or be 
stored at -20° C. 
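
As a consistency check on the first-strand reaction above, the component volumes can be tallied and the final concentrations derived. The sketch assumes all volumes are in microliters; the dictionary keys and function names are illustrative labels, not terms from the protocol.

```python
# Component volumes in microliters, as listed in the first-strand reaction.
FIRST_STRAND_MIX_UL = {
    "mRNA + DMSO + DEPC-treated water": 11.0,
    "oligo dT": 1.0,
    "10X first-strand buffer": 2.0,
    "0.1 M DTT": 1.0,
    "10 mM dNTP mix": 1.0,
    "RNAse inhibitor": 1.0,
    "reverse transcriptase": 3.0,
}


def total_volume_ul(mix):
    """Sum of all component volumes."""
    return sum(mix.values())


def final_concentration(stock_conc, added_ul, total_ul):
    """Concentration after dilution into the full reaction (same unit as stock)."""
    return stock_conc * added_ul / total_ul


if __name__ == "__main__":
    total = total_volume_ul(FIRST_STRAND_MIX_UL)
    print("reaction volume (microliters):", total)
    print("buffer (X):", final_concentration(10, 2.0, total))
    print("DTT (mM):", final_concentration(100, 1.0, total))
    print("each dNTP (mM):", final_concentration(10, 1.0, total))
```

Under these assumptions the 10X buffer ends up at its 1X working strength, which supports reading the listed volumes as a single self-consistent reaction.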

Example 8: Full Length PCR 

[0624] Full length PCR can be achieved by placing the products of the 
reaction described in Example 1, with primers diluted to 5 nM in water, into a reaction 
vessel and adding a reaction mixture composed of 1x Taq buffer, 25 mM dNTP, 10 ng 
cDNA pool, TaqPlus (Stratagene, CA) (5 U/µl), PfuTurbo (Stratagene, CA) (2.5 U/µl), and 
water. The contents of the reaction vessel are then mixed gently by inversion 5-6 
times, placed into a reservoir where 2 µl F1/R1 primers are added, the plate sealed and 
placed in the thermocycler. The PCR reaction comprises the following eight 
steps. Step 1: 95° C for 3 min. Step 2: 94° C for 45 sec. Step 3: 0.5° C/sec to 56-60° 
C. Step 4: 56-60° C for 50 sec. Step 5: 72° C for 5 min. Step 6: Go to step 2, 
perform 35-40 cycles. Step 7: 72° C for 20 min. Step 8: 4° C. 
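
The eight-step cycling program above can be written down as a small calculation to estimate instrument time. This is a rough sketch: it counts only the holds and the controlled 0.5° C/sec ramp of Step 3, and it ignores all other ramp times and the final 4° C hold, which is an assumption. The function name is hypothetical.

```python
def estimate_runtime_s(anneal_c, cycles):
    """Approximate run time in seconds for the eight-step program.

    Counts the timed holds plus the controlled 0.5 C/sec ramp from the
    94 C denaturation step down to the annealing temperature. Other
    ramps and the final 4 C hold are ignored (assumption).
    """
    if not 56 <= anneal_c <= 60:
        raise ValueError("annealing temperature outside 56-60 C")
    if not 35 <= cycles <= 40:
        raise ValueError("cycle count outside 35-40")
    initial_denature = 3 * 60          # Step 1: 95 C for 3 min
    denature = 45                      # Step 2: 94 C for 45 sec
    ramp = (94 - anneal_c) / 0.5       # Step 3: 0.5 C/sec down to annealing
    anneal = 50                        # Step 4: 50 sec at annealing temperature
    extend = 5 * 60                    # Step 5: 72 C for 5 min
    final_extend = 20 * 60             # Step 7: 72 C for 20 min
    per_cycle = denature + ramp + anneal + extend
    return initial_denature + cycles * per_cycle + final_extend


if __name__ == "__main__":
    seconds = estimate_runtime_s(anneal_c=58, cycles=35)
    print(f"about {seconds / 3600:.1f} hours")
```

The long 5 minute extension dominates each cycle, so the cycle count has a much larger effect on total run time than the choice of annealing temperature.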

[0625] The products can then be separated on a standard 0.8 to 1.0% agarose 
gel at 40 to 80 V, the bands of interest excised by cutting from the gel, and stored at 
-20° C until extraction. The material in the bands of interest can be purified with 
QIAquick 96 PCR Purification Kit (Qiagen, CA) according to the manufacturer's 
instructions. Cloning can be performed with the TOPO vector pENTR/D-TOPO 
(Invitrogen, CA) according to the manufacturer's instructions. 

SEQUENCE LISTING 
[0627] A sequence listing transmittal sheet and a sequence listing in paper 
format accompany this application. 

CLAIMS 

1. A first nucleic acid molecule comprising a polynucleotide sequence 
chosen from at least one polynucleotide sequence according to SEQ ID NOS.: 1-104. 

2. The nucleic acid molecule of claim 1, wherein the nucleic acid 
molecule is a DNA or an RNA molecule. 

3. An animal injected with the nucleic acid molecule of claim 1. 

4. A double-stranded isolated nucleic acid molecule comprising the first 
nucleic acid molecule of claim 1 and its complement. 

5. The nucleic acid molecule of claim 4, wherein the first polynucleotide 
sequence encodes a polypeptide chosen from a polypeptide comprising a signal 
peptide, a mature polypeptide that lacks a signal peptide, a signal peptide, a 
biologically active fragment of a polypeptide, a polypeptide lacking a signal peptide 
cleavage site, a polypeptide consisting essentially of an N-terminal fragment that 
contains a Pfam domain, and a polypeptide consisting essentially of a C-terminal 
fragment that contains a Pfam domain. 

6. A second nucleic acid molecule comprising a second polynucleotide 
sequence that is at least about 70%, or about 80%, or about 90%, or about 95% 
homologous to the first nucleic acid molecule of claim 1. 

7. A second isolated nucleic acid molecule comprising a second 
polynucleotide sequence that hybridizes to the first polynucleotide sequence of claim 
1 under high stringency conditions. 

8. The second isolated nucleic acid molecule of claim 6, wherein the 
second polynucleotide sequence is complementary to the first polynucleotide 
sequence. 

9. A vector comprising the nucleic acid molecule of claim 1 and a 
promoter that drives the expression of the nucleic acid molecule. 

10. The vector of claim 9, wherein the promoter is chosen from one or 
more of a promoter that is naturally contiguous to the nucleic acid molecule, a 
promoter that is not naturally contiguous to the nucleic acid molecule, an inducible 
promoter, a conditionally active promoter, a constitutive promoter, and a tissue 
specific promoter. 

11. A host cell transformed, transfected, transduced, or infected with the 
nucleic acid molecule of claim 1. 

12. The host cell of claim 11, wherein the cell is chosen from one or more 
of a prokaryotic cell, a eukaryotic cell, a human cell, a mammalian cell, an insect cell, 
a fish cell, a plant cell, and a fungal cell. 

13. A nucleic acid composition comprising a pharmaceutically acceptable 
carrier or a buffer and one or more compositions chosen from the nucleic acid 
molecule of claim 1, the nucleic acid molecule of claim 4, the vector of claim 9, and 
the host cell of claim 11. 

14. One or more polypeptide molecules comprising a polypeptide 
sequence chosen from at least one amino acid sequence encoded by SEQ ID NOS.: 
1-104. 

15. An animal injected with the polypeptide molecule of claim 14. 

16. The polypeptide of claim 14, wherein the polypeptide has a function 
chosen from an agonist, an antagonist, a ligand, and a receptor. 

17. The polypeptide of claim 14, wherein the polypeptide is chosen from a 
polypeptide comprising a signal peptide, a mature polypeptide that lacks a signal 
peptide, a signal peptide, a biologically active fragment of a polypeptide, a 
polypeptide lacking a signal peptide cleavage site, a biologically active fragment 
consisting essentially of an N-terminal fragment containing a Pfam domain, and a C- 
terminal fragment containing a Pfam domain. 

18. A polypeptide composition comprising the polypeptide molecule of 
claim 14 and a pharmaceutically acceptable carrier or a buffer. 

19. A cell culture medium comprising the polypeptide of claim 14. 

20. The cell culture medium of claim 19, further comprising responder 
cells chosen from one or more T cells, B cells, NK cells, dendritic cells, macrophages, 
muscle cells, stem cells, epithelial skin cells, fat cells, blood cells, brain cells, bone 
marrow cells, endothelial cells, retinal cells, bone cells, kidney cells, pancreatic cells, 
liver cells, spleen cells, prostate cells, cervical cells, ovarian cells, breast cells, lung 
cells, liver cells, soft tissue cells, colorectal cells, cells of the gastrointestinal tract, and 
cancer cells. 

21. The cell culture medium of claim 20, wherein the responder cells 
proliferate in the medium. 

22. The cell culture medium of claim 20, wherein the responder cells are 
inhibited in the medium. 

23. A cell culture comprising transfected cells, wherein the transfected 
cells are transfected with the polynucleotide of claim 1. 

24. The cell culture of claim 23, further comprising responder cells chosen 
from one or more T cells, B cells, NK cells, dendritic cells, macrophages, muscle 
cells, stem cells, epithelial skin cells, fat cells, blood cells, brain cells, bone marrow 
cells, endothelial cells, retinal cells, bone cells, kidney cells, pancreatic cells, liver 
cells, spleen cells, prostate cells, cervical cells, ovarian cells, breast cells, lung cells, 
liver cells, soft tissue cells, colorectal cells, cells of the gastrointestinal tract, and 
cancer cells. 

25. The cell culture of claim 23, wherein the responder cells proliferate in 
the cell culture. 

26. The cell culture of claim 23, wherein the responder cells are inhibited 
in the cell culture. 

27. A method of making a transformed, transfected, transduced, or infected 
host cell comprising: 

(a) providing a composition comprising the vector of claim 9, and 

(b) allowing a host cell to come into contact with the vector to form a 
transformed, transfected, transduced, or infected host cell. 

28. A method of making a polypeptide comprising: 

(a) providing a nucleic acid molecule that comprises a 
polynucleotide sequence encoding the polypeptide of claim 14; 

(b) introducing the nucleic acid molecule into an expression 
system; and 

(c) allowing the polypeptide to be produced. 

29. A method of making a polypeptide comprising: 

(a) providing a composition comprising the host cell of claim 11; 

(b) culturing the host cell to produce the polypeptide; and 

(c) allowing the polypeptide to be produced. 

30. A diagnostic kit comprising a polynucleotide molecule, wherein the 
polynucleotide molecule comprises a sequence chosen from (a) at least 6, (b) at least 
7, (c) at least 8, and (d) at least 9 contiguous nucleotides chosen from the nucleic acid 
molecule of claim 1. 

31. A diagnostic kit comprising a polypeptide molecule, wherein the 
polypeptide molecule comprises an amino acid sequence or a biologically active 
fragment thereof, derived from the nucleic acid molecule of claim 1. 

32. A genetically modified mouse comprising a deletion, substitution, or 
modification of a sequence chosen from SEQ ID NOS.: 1-104, wherein the deletion, 
substitution or modification prevents or reduces expression of said sequence and 
results in a mouse deficient in or completely lacking one or more gene products of a 
sequence chosen from SEQ ID NOS.: 1-104. 

33. A method of determining the presence of the nucleic acid molecule of 
claim 1 or its complement comprising: 

(a) providing a complement to the nucleic acid molecule or providing a 
complement to the complement of the nucleic acid molecule; 

(b) allowing the molecules to interact; and 

(c) determining whether interaction has occurred. 

34. A method of determining the presence of an antibody to the 
polypeptide of claim 14 in a sample, comprising: 

(a) providing the polypeptide; 

(b) allowing the polypeptide to interact with any specific antibody in the 
sample; and 

(c) determining whether interaction has occurred. 

35. An antibody specifically recognizing, binding to, and/or modulating 
the biological activity of at least one polypeptide encoded by a nucleic acid molecule 
of claim 1, or a biologically active fragment thereof. 

36. An antibody composition comprising the antibody of claim 35 and a 
pharmaceutically acceptable carrier. 

37. The antibody of claim 35, wherein the antibody is chosen from one or 
more of a monoclonal antibody, a polyclonal antibody, a single chain antibody, an 
antibody comprising a backbone of a molecule with an Ig domain, a targeting 
antibody, a neutralizing antibody, a stabilizing antibody, an enhancing antibody, an 
antibody agonist, an antibody antagonist, an antibody that promotes endocytosis of a 
target antigen, a cytotoxic antibody, an antibody that mediates ADCC, a human 
antibody, a non-human primate antibody, a non-primate mammal antibody, a rabbit 
antibody, a mouse antibody, a rat antibody, a sheep antibody, a goat antibody, a horse 
antibody, a porcine antibody, a cow antibody, a chicken antibody, a humanized 
antibody, a primatized antibody, and a chimeric antibody. 

38. The antibody of claim 37, wherein the antibody is produced in a 
manner chosen from in vivo and in vitro. 

39. The antibody of claim 37, wherein the antibody is produced in an 
organism chosen from a prokaryote and a eukaryote. 

40. The antibody of claim 39, wherein the organism is chosen from a 
bacterial cell, a fungal cell, a plant cell, an insect cell, and a mammalian cell. 

41. The antibody of claim 40, wherein the cell is chosen from a yeast cell, 
an Aspergillus cell, an SF9 cell, a High Five cell, a cereal plant cell, a tobacco cell, 
and a tomato cell. 

42. The cytotoxic antibody of claim 37, further comprising one or more 
cytotoxic component chosen from a radioisotope, a microbial toxin, a plant toxin, and 
a chemical compound. 

43. The cytotoxic antibody of claim 42, wherein the chemical compound is 
chosen from doxorubicin and cisplatin. 

44. The antibody of claim 35, wherein the antibody has a function chosen 
from specifically inhibiting the binding of the polypeptide to a ligand, specifically 
inhibiting the binding of the polypeptide to a substrate, specifically inhibiting the 
binding of the polypeptide as a ligand, and specifically inhibiting the binding of the 
polypeptide as a substrate. 

45. A bacteriophage, wherein the antibody of claim 35, or a fragment 
thereof, is displayed on the bacteriophage. 

46. A bacterial cell comprising the bacteriophage of claim 45. 

47. A non-human animal injected with the antibody composition of claim 36. 

48. A host cell that secretes the antibody of claim 35. 

49. A method of making an antibody, comprising: 

(a) introducing a polypeptide, polynucleotide encoding the 
polypeptide, or a biologically active fragment thereof into an animal in sufficient 
amount to elicit generation of antibodies specific to the polypeptide, wherein the 
polypeptide: 

(i) is encoded by the nucleic acid molecule of claim 1; or 

(ii) comprises the polypeptide sequence of claim 14; and 

(b) recovering the antibodies therefrom. 

50. The method of claim 49, further comprising after step (a), the step of 
isolating a spleen from the animal injected with the polypeptide or polynucleotide or a 
fragment thereof, and the step of recovering the antibodies from the spleen cells. 

51. The method of claim 50, further comprising the step of making a 
hybridoma using cells from the spleen and selecting a hybridoma that secretes the 
antibodies. 

52. The method of claim 50, further comprising making a polynucleotide 
library from the spleen cells, selecting a cDNA clone that produces the antibodies, 
and expressing the cDNA clone in an expression system to produce antibodies or 
fragments thereof. 

53. A method of modulating biological activity comprising: 

(a) providing the antibody of claim 35; and 

(b) contacting the antibody with a first human or a non-human host 
cell thereby modulating the activity of a first human or non-human animal host cell, 
or a second host cell. 

54. The method of claim 53, wherein the modulation of biological activity 
is chosen from enhancing cell activity directly, enhancing cell activity indirectly, 
inhibiting cell activity directly, and inhibiting cell activity indirectly. 

55. The method of claim 53, wherein the step of contacting the antibody 
with a first human or non-human host cell results in recruitment of the second host 
cell. 

56. The method of claim 53, wherein the first host cell is a cancer cell. 

57. The method of claim 53, wherein the first or second host cell is chosen 
from a T cell, B cell, NK cell, dendritic cell, macrophage, muscle cell, stem cell, skin 
cell, fat cell, blood cell, brain cell, bone marrow cell, endothelial cell, retinal cell, 
bone cell, kidney cell, pancreatic cell, liver cell, spleen cell, prostate cell, cervical cell, 
ovarian cell, breast cell, lung cell, liver cell, soft tissue cell, colorectal cell, and 
gastrointestinal tract cell. 

58. A method of diagnosing a disease, disorder, syndrome, or condition 
chosen from cancer, proliferative, inflammatory, immune, metabolic, genetic, 
bacterial, and viral diseases, disorders, syndromes, or conditions in a patient, 
comprising: 

(a) providing the antibody of claim 35; 

(b) allowing the antibody to contact a patient sample; and 

(c) detecting specific binding between the antibody and an antigen 
in the sample to determine whether the subject has cancer, a proliferative, 
inflammatory, immune, metabolic, genetic, bacterial, or viral disease, disorder, 
syndrome, or condition. 

59. A method of diagnosing a disease, disorder, syndrome, or condition 
chosen from cancer, proliferative, inflammatory, immune, bacterial, and viral 
diseases, disorders, syndromes, or conditions in a patient, comprising: 

(a) providing a polypeptide that specifically binds the antibody of claim 35; 

(b) allowing the polypeptide to contact a patient sample; and 

(c) detecting specific binding between the polypeptide and any 
interacting molecule in the sample to determine whether the subject has cancer, a 
proliferative, inflammatory, immune, bacterial, or viral disease, disorder, syndrome, 
or condition. 

60. A method of identifying an agent that modulates the biological activity 
of a polypeptide comprising: 

(a) providing a polypeptide or an active fragment thereof, wherein 
the polypeptide comprises at least one amino acid sequence encoded by SEQ ID 
NOS.: 1-104; 

(b) allowing at least one agent to contact the polypeptide; and 

(c) selecting an agent that binds the polypeptide or affects the 
biological activity of the polypeptide. 

61. The method of claim 60, wherein the polypeptide is expressed on a cell 
surface. 

62. A modulator composition comprising a modulator and a 
pharmaceutically acceptable carrier, wherein the modulator is obtainable by the 
method of claim 60. 

63. The modulator composition of claim 62, wherein the modulator is an 
antibody. 

64. A method of treating a disease, disorder, syndrome, or condition in a 
subject, comprising administering the composition of any one of claims 13, 18, and 36 
to the subject. 

65. The method of claim 64, wherein the composition is administered in a 
manner chosen from orally, parenterally, by implantation, by inhalation, intranasally, 
intravenously, intra-arterially, intracardiacally, subcutaneously, intraperitoneally, 
transdermally, intraventricularly, intracranially, and intrathecally. 

66. The method of claim 64, wherein the disease, disorder, syndrome, or 
condition is chosen from cancer, a proliferative, inflammatory, immune, metabolic, 
genetic, bacterial, and viral disease, disorder, syndrome, or condition. 

67. The method of claim 64, wherein the disease is cancer. 

68. A method of treating a disease, disorder, syndrome, or condition 
chosen from cancer, proliferative, inflammatory, immune, metabolic, genetic, 
bacterial, and viral diseases, disorders, syndromes, or conditions in a subject, 
comprising: 

(a) providing an antibody composition that comprises a first 
antibody, or fragment thereof, that specifically binds to a first epitope of a first 
polypeptide or a biologically active fragment thereof, wherein the first 
polypeptide: 

(i) is encoded by the nucleic acid molecule of claim 1; or 

(ii) comprises the polypeptide of claim 14; and 

(b) administering the antibody composition to the subject. 

69. The method of claim 68, wherein the antibody composition further 
comprises a second antibody that binds specifically to or interferes with the activity of 
a second epitope of the first polypeptide or to a first epitope of a second polypeptide. 

70. The method of claim 69, wherein the second polypeptide comprises the 
polypeptide of claim 14. 

71. A kit comprising the antibody of claim 35 and instructions for its use. 

72. A method of gene therapy, comprising: 

(a) providing a polynucleotide comprising a nucleic acid molecule 
encoding the antibody of claim 35; and 

(b) administering the polynucleotide to a subject. 

73. A method for prophylactic or therapeutic treatment of a subject, 
comprising: 

(a) providing a vaccine; and 

(b) administering the vaccine to the subject; 

wherein the vaccine comprises a polynucleotide or a polypeptide 
chosen from at least one sequence according to SEQ ID NOS.: 1-104 or a 
biologically active fragment thereof. 

74. The method of claim 73, wherein the vaccine is a cancer vaccine, and 
the polypeptide is a cancer antigen. 

75. A method of inhibiting transcription or translation of a first 
polynucleotide encoding a first polypeptide, comprising: 

(a) providing a second polynucleotide that hybridizes to the first 
polynucleotide, wherein the first polynucleotide comprises a polynucleotide sequence 
chosen from: 

(i) at least one polynucleotide sequence according to SEQ ID 
NOS.: 1-104; 

(ii) a polynucleotide encoding a polypeptide comprising an amino 
acid sequence chosen from at least one amino acid sequence according to SEQ 
ID NOS.: 1-104; and 

(iii) a polynucleotide encoding a fragment of a polypeptide 
comprising an amino acid sequence chosen from at least one amino acid 
sequence according to SEQ ID NOS.: 1-104; and 

(b) allowing the first polynucleotide to contact the second 
polynucleotide. 

76. A method of treating a disease, disorder, syndrome or condition 
comprising administering a modulator to a subject, wherein the modulator binds to a 
cell surface molecule that is over-expressed in the disease, disorder, or condition, and 
is linked to the antibody of claim 35. 

77. The method of claim 76, wherein the antibody is capable of initiating 
ADCC. 

78. The method of claim 76, wherein the disease, disorder, syndrome or 
condition is cancer and the cell surface molecule is over-expressed in a cancer cell. 

SEQUENCE LISTING 
<110> FIVEPRIME THERAPEUTICS, INC. 

<120> NOVEL MOUSE POLYPEPTIDES ENCODED POLYNUCLEOTIDES AND 
METHODS OF THEIR USE 

<130> 08940.0012-00304 



<150> 60/485,217 
<151> 2003-07-08 

<150> 60/485,539 
<151> 2003-07-08 

<150> 60/476,621 
<151> 2003-06-09 

<150> 60/476,632 
<151> 2003-06-09 , 

<160> 104 

<170> Patentin version 3.2 

<210> 1 

<211> 2145 

<212> DNA 

<213> Mus musculus 

<400> 1 

aaacaggtga cacaggagta gatgttgtct tagtcagggt ttctattcct gcacaaacat 60 

catgaccaag aagcagttgg ggcggtaagg gtttattcag cttacatttc cacatttctg 120 

tttatcacca aaggaagtca ggactggaac tcaagcatgt caggaagcag gagctgatgc 180 

agaggccatg gagggatgtt ccttactggc ttgcctcccc tggcttgctc agcctgctct 240 

cttatagaac ccaagactac cagcccagag atggtcccac ccacaagggg cctttccccc 3 00 

ttgatcacta attgagaaaa tacctcacag ctggatctcg tggaggcatt tccacaactg 360 

aagctccttt ctctatggta actccagctt gtgtcaagtt gacacaaaac tagtcagtac 420 

agatgtcttc atggagaaga gagggtgagg attgtaacta tctggggaga gaagccgggt 480 

gggtgggata agatgtgcga tgatcttttg tgtaacattc tcacatactc ctcaatgact 540 

tatacatgtg attccaagcc cagtccacag aatggactaa gtcatttcct ttcctgggcc 600 

agggttctga ccgtgtatgg agagtagagt agatttaaga ttgcccatct tccctggttc 660 

tctgtggtca ggcttgctca ggactcagtg cacatgcctg tcttgggagg tgcgttagct 720 

cttccctgat gccatttctc aagctgtaga gttttcccct gtccccttga acgagccact 780 
gtcgtctggg taaggggact cccctgcctg cagcaaggca ggactattgt gttccctcct 840 
tagttctgtc ttccatctgt taacagtggt gcctgctgct ctttatgttc atagtctgac 900 
gagggaggct aatgacccac agtggggtcc cagggtcaaa ctgcctatgt ccttcttctg 960 

ccgttgacca tgagacttgg gctcatggcc tctctctgct tccttgggtc agtgtgggta 1020 

ggcacaggga aaggctctgg tgaggacaag gctttgctga ccattctgct gcttctctgg 1080 

gtctctctgc cttgtctgct ttctcctcct tcttctgttt cccctgtttc ctcctcctct 1140 

ggtaccttct tcccccaacc tggagagtgt gcctgggttt acagcgtgca tcccctgccc 1200 

tcactcagga acaggtgtgt ggcctctgcg ggaactttga tggcatccag aacaatgact 1260 

tcaccactag cagcctccag gtggaggaag accccgtcaa ctttgggaac tcctggaaag 1320 

tgagctcaca gtgtgctgac acgagaaagc tgtcactaga tgtttcccct gccacttgcc 1380 

acaacaacat catgaaacag acgatggtgg actcagcctg cagaatcctt accagtgacg 1440 

tcttccaggg ctgcaacagg ctggtgagac tttgtgggta gagaggaggc aaaatgatcc 1500 

aaaggcttgt gcctgccttg agtgtcagtg tccgatgtga gcaccaacta ggctggagtc 1560 

ctgcaagggt tcaccacaat agctctttct tggcttctag atgagaagag catgcatagt 1620 

ctttgaggga agcaatggct ggccaagcgc tttccttgta cgcaagaagt ctcttgctct 1680 

gaagttctct ctgcattgcc tccaatcaga gccagggaat tcctttcctt gttactgttc 1740 

tacgggtcca gacccagcca gaacctttcc aatcatcaac cacaagcaaa tcagcaaact 1800 
gcctctttag tgcccagttc ccttctcatt caaactcgtt ttctggcatt aagtccagtg 1860 
attttggagt ctgctttttt tttttttctg gcttcttttc attcctctgt ctcacacatt 1920 
tcatagcata taacatcaaa caatagtgat gaattctcca cagttaagtg gacttctttg 1980 
ggcatttgca tgcacgtgcg cacaagcatg tatgattgtg tctctacaca aatgcttatc 2040 
cattgtgcac ctgtgctgtt tcccttaaac atgattgatg ggttgagtcc atgtatctca 2100 
tttttttttt agctctccag agacttccag cccagaaaca gtcct 2145 
<210> 2

<211> 2412 

<212> DNA 

<213> Mus musculus 

<400> 2 

gagttcatag atctttgtgt aaaactgagt atctccatgg gcatgtaatg tttaagtctg 60 

cctagttaac agtgacaaac tttatttttg cattttgggc aatcattgac ctctgacaga 120 

acttaaggga catatttatg agcttctggg aaaggtcttc aattctactc cccatcatga 180 

cctctacagg aacgttcctt ctagccagcc aagtagcaaa agaacaggac gacatctgag 240 

ctgtcccttc cctcctccgt gggccatacc gctccggggg cttgacgcca ggggtaacct 300 

gttctcatct gatgttctct ttagagaatg gcaatggtct ctgcaatgtc ctgggccctg 360 

tacttgtgga taagtgcttg tgcgatgctg ctctgccatg ggtcactcca acacaccttc 420 

cagcagcatc acctgcaccg gccagaagga gggacctgtg aagtgatcgc ggcccacagg 480 

tgttgtaaca agaaccgcat cgaggagcgg tcacaaacag tgaagtgttc ctgtttacct 540 

gggaaagtgg ctgggacaac aagaaaccga ccttcctgtg tggatgcctc catagtaatt 600 

gggaaatggt ggtgtgagat ggagccctgc ctagaaggag aagaatgtaa gacactccct 660 

gacaattctg gatggatgtg tgctacaggc aacaagatta agactacacg aattcaccca 720 

agaacctaac agaagcattt gttatataaa taggaaaaag aacaacctgt ggaatatacg 780 

ttgtgaggat ttaaaacatc ttccatagtt gcaagccaag tggatctctt atctgcactt 840 

tggttaccag ataaccacag tgcacttact ctgatacaca gtatcccaaa agaagaagac 900 

tcgggatttt ctggcaacat caaggaaaat ggcttttaaa aaaaaatgag ttttctctgt 960 

gaaatttgga ggatcatgaa gaacgatcaa ctgtcttcta atttggaact aacattactt 1020 

tgtaccattt gaaatatata tgtatatata atattttgaa atattatata ttctcttcaa 1080 

gaaatgaaca gtaccacaat gtgaggtggc tggtgtatcc ctttcagttt tggatgtttg 1140 

gtcggttttg ttttgtttgc cattcctttt tctctcggta aggaagatac atgcccatgt 1200 

gaaaatccaa catggcactc tccctggaag gccagctgca agccgactcc tggaagctga 1260 

ggcatcctaa cagtactgag tcaagagctt ccccctgttt ctacctggtg acccaaggaa 1320 

gctccttgtc ttgatttatt gctttctatc ctgtgcaata ttagcatgca agcttggctt 1380 

acataatcat actttatatt cgattgatat ataataaccg ttctaacctc ttccaggaaa 1440 






atatttttag aactactagc ttttccacag ataagcaaat ggggattctt gagggagctg 1500

ctccaccgtg ctattaagac tctggcagga tggtaggatg tagatcccta tattaataag 1560

tcctgtaaat acagtgtctt agggctttgt atagctgtcc tagactacag aagtgtcctc 1620

tgattaaatc caaagtctgc cattgttaac tccatagtgc tgtagcgaca cgttttatca 1680

tggcgcctct ttctatgttt gctttgcttt ttcctagagt gttcatctct cctctgatga 1740

gataggaaag ctatggaagc aattaggttt cccaatgatc tatgtgacca agtgttggac 1800

agccctatta aagtggtaaa taacttcttt cttaacccac tcgtcctctt tgtctgccat 1860

ttagttttat agactctctt ttaactaaac cgagagatca cgagactacg gaggctatta 1920

tttccaacaa taatattttt gaaacttaga aactcattaa tatgattgta gtaaaatatc 1980

caattacaat ttctgggatt ccatgtgggt cactctgata atatatatgg ggctcacaca 2040

cacacacaca cacacacaca cacacacaca ccaatgagag gtaaaaaaca acaaaaacct 2100

aaaatccatt atccctgtct ttttcgagtt actgagctga gaggtaagag gtaaacacct 2160

acttaaggtc tagttttcct ggggaaaatt ttaatgggat tttacactcc gatgtcatct 2220

cacaactgca gggttttttt tttcttccca aattattctc tgtcatctgt gtttttagag 2280

cctcttagta aatcacagca ctgtagtccc taagtctgta ctttttagga tcaaacttca 2340

agtaaaactc aagattacct cttattatac ccagaaagcc tgaagtttaa ttgaatgtgt 2400

gaagttctaa CO 2412



<210> 3 

<211> 2627 

<212> DNA 

<213> Mus musculus

<400> 3 



gctttttttt tttttttttg gcttgtattc tggacttaaa gtcacagccc tgccctgaac 60

ttcactagtg ccagcatctc gtccaaattc tgctccccca ccccttcctc accttcctcc 120

attcctgccc ccaaccgggc attctgcttt cactcatcac tgccaccttt ttgtttctct 180

tccccgattt agattttcgt gccagacgtg cagagcagcc atctcatttg gtaaacacta 240

gatgaacact tcttaaatag aagcaacatc ctttccagct agctttgtta aaggggcaga 300

gaccgttcag cccatcaaac tgatggatta aaaaaaaaag taacatgtaa gttagaaaaa 360

tacttcagga aagatgtgcc atcattatca gttcctcaag ataatcaatt agaaggaaac 420
tcattgacca ctaagtgtgg aactcttccg aaagcagcag gtgaagggga agcaactcct 480

ggagcccacc tgccatcaag cctgttccca atgctgaacc tacacaagag gcaatagaac 540 

cctatttgag ggtcctgact tttctgttca gtatctgcca tagatgggct acatcaaagg 600 

aaagattggc tagcaaacca gtccaaatct atgacccttc ttttaaaatg gatcatgagg 660 

gccttaaacc atgggtcaac tgcaaagtgt cataataact ttgcttccca agtttcttta 720 

tgaatttgcc tggaagtggc acacctttga agaacttgtg tgtgtgtgtg tgtgtgtgtg 780 

tgtgtgtgtg tgtgtgtgtg tgtgtgtgtg tatttatgtg tttgttttct ctgtggaatc 840 

ttgaacatct accccttttc taccttagtt tatttaccta tcaaatgagg ttgaatatat 900 

gaaaacaatt attaaattgt ggaaattcat gtgaaattta ctaagaaatg acaacaacgc 960 

tgtgtataca gcaaacactc aataaatttt aataattatt tcagtaaaac cacaattact 1020 

gtataaggat tgaaagacat tacagacaat atcatgcctg cccctgagaa agattcgcaa 1080 

taagcactgt actaaacgca tacatttttt tataatcact tgatagcaaa ggtagtttcc 1140 

actggaaaag tgcaattaag agctcctaaa acaaagcaaa actgtgtggt gtgtatgtgt 1200 

tcatacagtt tctaaagaca gaaaattcta aaactgactc cccataggct gctactccaa 1260 

gaagctgtag aactgccatt aggtctaagt cttgctttca gaaaaatcca tggaaagtgt 1320 

tagcagagaa caatccaggg gaagttactg tagcattcag aaacctggcc aggtggcttt 1380 

caaggtctga gaatgtgagg gaggaggtat ccaaaagtag tctctaaaag agccacaggt 1440 

catgggcagc tccacccagt tgatcccctt atctttgctg gctcagctac ttgtccttgg 1500 

cagatattcc ggcaatgcac tctgttgaac aagagtgaat gggagaactt taaacagggc 1560 

aaaaagaaag cttagcaacg gagggaaaag tgagtatttt ctaaccaaag tcactgcctc 1620 

taactagatt ttatagtttg ggttaaaaaa tttaaatctc tcttaagctt taggtgggtg 1680 

ttctctctct ctctctctct ctctctctct ctctctctct ctctctctct ctctctctct 1740 

ctctctctag ttaatatatt atttacttag caaatgtttg ttgaccattt atcctgccac 1800 

atgtcaggat ctgctatagg aaagaagata acaattccaa ccttgtatga ctgttgtaaa 1860 

gagaacatga gtttgtatat agaacatgct tgatacaatc attagtaccc ataccattgg 1920 

ctacaatcat tgagacaatc attaatacca tcggctagtt aatgattatt atactttagc 1980 

ctgcatgact ctcaccatag agcttaacaa ttatggattt ttcctggagt aacacccata 2040 
gatactccag ttctgaaggt ataaaagaaa gtgacatagg aatgtgcatt tttaatggct 2100

gtaataactg actgtgtggc ccatgaaatg gaacttcaag agaagtggac tttttcttca 2160

tgttcttccc aatttccaaa tgaagattca aagcctggat tcaaaggtct cttgactcta 2220

ccggcctctc aatgctacat tcaatgtctt gattgataag aacaatgtgc atatgaccag 2280

gaatgatagt aaggtaaaaa tgtgtttgta tttgcccatt cccaagttac tttatgatac 2340

ctaaaggagg ccatgattga cagatcatca ttatcccttg aataaatgtt ctagcataga 2400

gtacattgca tgtggcttta agatttgctt agacaaaagc agataccata catctataat 2460

cctgtgtctt gttttattct aataatagtc tccaagtgct tctcaattca actcacctat 2520

gttactcttc ccctgattgc agtgttcctg atggctgtta tctttcccga tttagtcact 2580

tcctcttaat agacactaag agtatctttc agaagctcct gagagac 2627



<210> 4 

<211> 3153 

<212> DNA

<213> Mus musculus

<220>

<221> modified_base
<222> (1410)..(1410)
<223> a, c, t, g, unknown or other 

<400> 4 



cccccatcct tgcctaaact ctagatatgg tctctacatg ttctatctcc cctttgttgg 60

gtatttcagc ctatctcatc cccctttggt cctgggagcc tcttgctttc ctggcatctg 120

ggaattgctg gtggctatat ctagttccca atcccccatt gctactaaat accactgttc 180

aatttcctgg cactctgtat atcacccctg cctcctccca tacctgatcc catccccaat 240

tctcccttcc cttcctcttt tcctccgaag tctctcccac actacttcac aagagtattt 300

tgttccccct tctacacatt tcagtgttcg ttcttcttga cctacatatg gtctatgaat 360

tgtaatttgg gtattccaag cttttgaaca aatatcaacc tatcaatcag tgcataccat 420

gtgtgctgtt ttgagactgg gtcacttcac tcaggatatt ttctagttcc atccatttgc 480

ctaagaattt tatgaagtca ttattttaat tagctgagaa gtattacatt gtggaaaggt 540

actacatttt ctatatccat tcctctgttg aaggacacct gggttctttc cagcatctgg 600

atattataaa taaggctgct atgaacatag tagaacattt gttcttgtta tatgtctttt 660
gggtatatgc cctgtagtag tatagttgta tcttcaggtc caattttctg aattcaatta 720

tctgagtaac tgccagactg atttccagag tggttgtact aggttgcaat cctaccaaca 780

gtagaggaag gttcttcctt ctccacatcc tcaccagcat cttctatgtc cagaattttc 840 

gatcttagcc attatgacta ttgtaaggta gaatctcagg gttgttttga tttgcatttt 900 

cccaatgatt aaggatgatg aacatttctt taggtgcttc tcagccactt gagactcctc 960 

atttgagatt tttgtttgtt tgttttagtt ctgtactcca tttataatat ggttatttgg 1020 

ttctttggaa tctaattggt tgagttgttt ggatattagc ctgctattgg atataggttt 1080 

ggtaaagatc ttttaccaat ctgtaggttg ccattttatc ctgttgaccg tgttctttgc 1140 

ctaaataaaa taaaataaat aaataaaaat aaacttttca attttatgag atcccatttg 1200 

tctatagttg aacttacagc ctgagccatg ggtgtactat tcaggaaatt ttcacatgtg 1260 

ccattgtgtt caaggctctt tcccactttc ttttctattt gattcagtgt gtctggtttt 1320 

atatggaggt ccttgatcca ctgggactag aggttagtca atgagatagg aattgatcag 1380 

tttgcatatg tctacatggc atgaccatan ggtgtgtggg tttaattccg ggttttcaat 1440 

ttcattccat tcatctacct gtatgtcttt gaaccaatat catgaggttt ttttgttttt 1500 

tgtttttttg tttttatcac catcacactg tagtgcagtt tgtggtcaag agtgtttatt 1560 

ctcccaggag ttctcttatt ggtgggaata agaatagttt tcatatcctg cgtttctgct 1620 

atacagataa tttagaatgc tcttctatat ctgtgagaat gagttgcatt tgatggggat 1680 

tgattgatct ggtattgctt ttggtagatg ccattgtttt atgtaatgct gcaatcatga 1740 

gaatggagat ctttctgtct tctgagagct ttttcaattt ctttctttag agacttgaac 1800 

ttctttactt acagatcttt cacttgcttg gttagagtca taccaagata tttcatatta 1860 

tttgtgacta ttttgaagga tttcatttcc gtaatttctt cctcagacta tttatccttc 1920 

gagtagagga aggctactga ttgaaaaatt ttatatccag ccactttgct gaagttgttt 1980 

atctgctata ggaattctct gatgcaattt ttggagtcac ttaagtacaa tatcatataa 2040 

cctgcacaga attatatcat gacttcttcc tttctgatat gtgtcatttt gaactcattt 2100 

ggtggctaat tgccctgggt agaacttcca gtacaatact gcatgtgtag ggagagagta 2160 

ggcagccttg tttaatccta gattttagtg tcattgcttc aagtttcact ccattttgtt 2220 

tgatgctggc tattggtttg ttgtgtatgg taagtatgtt taggtatggg cctttaatta 2280 
ct^afcatttc caagactttt tacatgaagg gttgttgtat tttgtcaaat gatttttcag 2340

tttgaccatg tgggtttttt tctttgagtt cctttatata gtagattata 2400

ttgatacatt tccatatatt gaatcatccc tacatccctg agataaagtc tacttgatca 2460

tcgttttgat gtgttcttgg atttggtttg tgagaatttt attgaggttt 2520

ttattgcatg aatattcata agcaaaattg gtctgaagtt ttcctgcttt gttcagttgt 2580

tgtgtgtttt tggtatcagc ataactgtac cttcataaaa cgaatggggt ggtgtccctt 2640

ctgtttctat tttgtgaaat aatttgaata gtattggtat taggtcttct ttgaaggtct 2700

gatagaattc tacactaaaa taatctgctc ggatttttgt tgttgtttgg agatgtttaa 2760

tgactgcttc tatttcttta gggtttatga gactatttag atgatttatc ttactctgat 2820

ttcaatttgg tagctggcat ctgtatagaa attgtccatt ttatccatat tttcaagttt 2880

tcttgagtat agtcttttgt agtaggatct catgattttt taaattttct ctgtgcccta 2940

tagttagttt gactaaaggt ttatctattt tgttcatttt ctcaaaaaca acaacaacaa 3000

caacaacaaa aaacagctcc tagttgtgtt aattctttat ataattctct gtaggcttat 3060

tgattgtgtt ctgctccttg gggaagaacg atacatgagt tgaccatgct tacaaaagat 3120

tgtctctggc tgggtgtggg aaatgcctta agg 3153



<210> 5 

<211> 2900 

<212> DNA 

<213> Mus musculus

<400> 5 

gagtatcaaa ggcatgaacc accatacact gcctccaaac ctgcttgaac atgaagcctc 60 

agtccgtcgt ccaagaagtg atttcgcata cacaacaaac acgtcaacgc ccccacctca 120 

gcacacagag gaaaatgcca aacaggttac ttacgaggag tcttcgtttg ccttcgaaag 180 

accccaacag gtcggccact gacttcttgg gacttggagc gagatgtttt gcggtgggct 240

tctgattgtc atcctgctct gtcttacccg cctttttctt cttattcttg tctttctcct 300 

gtttggtctt tttctcagct ttcttcgcct tgcttttcac tcttgagtaa cttgtcttgt 360 

tttgcactct tccctttttt cttcttctcc ttctcaggtt tctcgagctt ttcaagcttt 420 

ttcgactcct tggcattctg cgagggaacg ctatccacct gccgctcctc ctccacctga 480 
gaggaggtgg gctggctgag gtcatacttg tcttcatatt cgttgctgag aattctgtcc 540

ttctttttgg gcagcttccc ctttactggc ttgtgaggac ctggcaccgc atttgggtcc 600 

tgatggccat gttcccgttt gtctgtccgg ttgtcccgga aacggcttgt gcccacaact 660 

cttgcgctga cctcccagag ggtgggaggt ggggtggctg tgaagcttcc atagttggtg 720 

gctttgctgg gcctcctggt ggcctgtggc ttctctctct gttgctcctt ccgaggtaga 780 

gggtatagct gctctgagac ggaggggccc ctggcagtgg tcacctctgc tgttgcaggg 840 

ggcctgtggg agactgagaa gggatgcaac cgggatgtcc agggcctctg ggtagctgga 900 

taggcagtgg tggttgtagg tcttgcggct attgtcacta cccgggatgt ggcccgagtg 960 

gctgtcgtga ctggagcagg aggaagggtg gtggcccttg gagttggggg aggctgagga 1020 

gtggctgcct tgctggtggc ctgctttctc ggagcctctc tggtggggtg gacttgtgtt 1080 

ctccttgggt cctctttcct cttatcaccg cccaggcctg tacctcctgc tcctccaccg 1140 

ccctcgtttc cttcctggac tacatggccc tctatgcccg aggccttaca cttttggaca 1200 

aaacccttct gcctgatttt ctcgatcctg cggatgggac cttgatcaat gacctcatac 1260 

atggcttcca gcctgaccgg gtaagggtag cgctcctcca cctgcagagt cttttttaac 1320 

agcaccatgc taaacttgcc cttctccagt ttcaagaagc tcatcagctt tgggatgaga 1380 

ttggggtcca gaggctgctc caggatctgc ccttcattgg tgatcctccg taccttgccc 1440 

ccttcctcgc ctgcctggtg gaagagcaca atctgttgga tgtgcctttc tgccagctca 1500 

caatacacgt cgtccttcag gaggctcatc atgaggcggt agtagccctc tgaggcgtga 1560 

ggggctgaga tgacccatac cctgttcttt cctgcaaagc tggccaggat attgggagag 1620 

ctagatccag aggggaatcg caacattctt gtccgagcag aagacccctc atctcggacc 1680 

atctcacgcg ccgagctcct agctatgggt ctcacctcag gtctgactgg aactccattg 1740 

atacctgagg ggggtggtct catggtcggg tgagctaatc tcaacactgg tacactcttc 1800 

ctcctttgga aaggctgagg atttggttct tcctgggtgg atttctcaac tccaccagac 1860 

ctcccggtgt gcctcagata ccgagctgac ctactgctga ttggagaaac caaaggtact 1920 

ttccgtccag tgtggctgtc gctatccaag gcaggagaat gagatgctga tccacacacc 1980 

agccacatgg ccaagagcgt ggtgaagtgg ggtcccattt tccacatcat tgtattatcc 2040 

acttggggac gcagaggggg tataatacaa aaatgaaaat aacataaaat gagaagggag 2100 
aaaaagaaaa gaaaaaaagt cattaattgc aagcagagct gggcatgatc tctttctcta 2160

acttggccaa aaagaaaggg caggagatag ttaaagaaga aaaaggaggg caagcagagg 2220 

agggcaaaca gagagagggc ttctcttaaa ctgctgcgat tgtcagtgct tagcagagtc 2280 

agtctttaag caggaaatcc agctttgatt tacaggcggc aggcatgagg cacgcctgct 2340 

ctgcctcagc tgtgcccaga tgcaccagag actgtgtaaa ctgagcatgc taattatgaa 2400 

ttctaactgt gaatgtgcat tcacacttca tgtcttcaaa caagcaacaa ggaaacaaac 2460 

accagagagc aaggttggca cacttacaca gtggttcatg gagatgcttg cagaggcttt 2520 

ccagagaaac gtgaaggttg gtccccagga cagtttccta ggtgagatcg gttccttcct 2580 

gctctctgtc tccttgtctg tctgcaaagc agccccaggg agcccagcct gggccgattc 2640 

cttttgctgg ctgtgtcttt taccttctgg tccagtggga ggaggtgaac gctgctgtcc 2700 

ttgggaccgt ttgtatctcc atttgtctgc agtttctcct gggtctgtgt gtgccggtcc 2760 

ttggttgcca cagggatctt cctgaaagtg aagtcctagt gatttctgaa tctcgctctt 2820 

cccttctgtg gagttttccc agaacttctt atcacccttc ccctcctgca gtctggccga 2880 

gtccctttgt ttccgagcgt 2900 

<210> 6 

<211> 1852 

<212> DNA 

<213> Mus musculus 

<400> 6 

atcttctttg atttctttct taagagactt gaagttctta tcatacaaat ctttcacttc 60 

cttagttaga gtcacaccaa ggtattttat attatttatg acctttttta tttttttgtt 120 

tttcgtgaca gggtttctct gtatagctct ggctgtcctg gaactcactt tttagaccag 180 

gctggcctca cactcagaaa tccgcctcct ctgcctctcg agtgctgaga ttaaaggtgt 240 

gcgctatcac acccggctct ctttatgact attttgaaga gtgttgtttc cctaattttt 300 

ttctcagcct gtttatcctt tgtgtagaga aaggccactg atttgtttga gttaatttta 360 

tatccaacta cttcactgaa gttgtttagg agttctctga tagaattttg gggtgactta 420 

aatatactat catatcatct gcaaatagtg atattttgac ttcttccttt ccgatttgta 480 

ttcctttgat ctccttttgt tgtctaattg ctctagctag gacttcaagt actatattga 540 

ataggtaggg aaagagtggg cagccttgtt tagtccctga ttttagaggg gttgcttcaa 600 
gtttctctcc atttagtctg atgttggcta ctggtttgct gtatatggct tttattatga 660

ttaggtatgg gccttgaatt cctgatcttt cccagacttt tatcatcaac ggaggttgga 720 

atttgtcaaa ttctttctca gcgtcccaag agatgatcat gtggtttttg tctttgagtt 780 

tgtttatcta ggtggattat gttgatagat ttctgtaaat tgaaccatcc ccgcatccct 840 

gggatgaaac ccacttgatc atgatagatg atcgttttga cgtgttcttg gatttggttt 900 

tcgagaattt tattgagtat ttttgtattg atattcataa gggaaattgg tctgaagttc 960 

tctttctttg ttgggtgtgg tttagatatc agagtagttg tgacttcata gaacaaactg 1020 

ggtagagtac cttctgtttc tattttatgg tatagtttga ggagtattag aattaggtcc 1080 

tttttgaagg tctgatagag ctctgcacta aacccatcag gttctgggct tttttttggt 1140 

tgggagacta ttaatgactg tttctatttc tttaggggat atgggactgt ttagatcatt 1200 

aatctgatcc tgattttact ttggcacctg gtatctgtct agaaacttgt ccatttcatc 1260 

caggttttcc agttttgttg agtataggct ttcgtagtag gatctgatga ttttttggat 1320 

ttcctcagat tctgttgtta tgtctccttt ttcatttctg attttgttaa ttagtatact 1380

gtccctgtgc cctctagtta gtctggctaa gggtttatct atcttgttga tttttctcaa 1440 

agaatcagct tctggtatgg ttgattcttt gaatagttct ttttgtttct atttggttga 1500 

tttcagccct gagtttgatt atttggtgcc ttctactcct cttgggtgag tttgcttcct 1560 

tttgttctag agcttttagg tttgctgtca agctcctcgt gtatgctctc tccagattct 1620 

ttttgaaggc actcagagct atgattctaa cttttaggga ctgcttcatt gtgtttcata 1680 

agtttggata tgttgtggcc ccttctcatt aaactctaaa aagtctttaa cttctctctt 1740 

tataaccagt cttccaaaca aaggaaagat aaggaatatc tgactgtgct cttctttcca 1800 

ggggggcatt cacactaata tgtaggctta agtgatttgc ttctttaaaa gg 1852 

<210> 7 

<211> 2417 

<212> DNA 

<213> Mus musculus 

<400> 7 

ggtgaaggtt gccctaggat gtctaagaac atggctaaac aagccatggg gaggaagcca 60 

ctgagcaacg gtccttcaga gccattgctc tgttcctgcc tccaggttcc tgctctggct 120 
tcttccttca gtgatgcccc gtgaactgta agatgaaacg aaccctttcc ttccctgatt 180

gcggttgatc atcgtgttct ttgcagcaac agaaggcaaa tgaggacacg gaccattagg 240

tctaactagc cccttctacg tgtatttgga ggcccgaccc ctagacaatt acagtaccca 300

tcacattggt gacactggcc accatggcta ggcatgacag ggaagcttca ggactcccct 360

ttgccagggt gttctgagtc tggaacaagc ttctccttaa ctcaaaggga gttcacgctg 420

ctgagttagc ctggagacat ttggggtctg tggacagtga agatggatgg tgtctacagg 480

atggtggtgg tggtggtgac agcaactgac atgagctggg tgtgggactc aggcattact 540

tggatcagct cattcagatc accacagctc tagcaagaca gtggcaacat tgccctcttc 600

tttagatggg agactgtgtt gctaaaactc tcatcagact gttgatggta gcatccggct 660

ttgctcacct ctgcaagtga ccatgaccag gggatgagat ctctctactg acaagcaagc 720

aagtcacggg cacttgtggc atacaatatg gcccctgttc ttctgttact cttcttatta 780

gataaaaagt gtgaatgttt gtgtgtacat gttcatatgc atgtgtgtga gcatgtatgt 840

atgtgtgagc atgtatgtgt gtgtgagcat gcatgtatgt gtgagcatgt atgtgtgtgt 900

gagtgtgtgt tactcttctt attaaaataa gaatgtaaat aaaatgtgtg tacatgttca 960

tatgtatgtg ttgtgagcat gagcatgtgt gtgtgtgtgt gtgtgtgtgt gtgagagaga 1020

gagagagaga gagagagaga gagagagaga gagagcattt gtacaggtga aatgtcaagt 1080

gtcttccttg atctcctccc atcttgtttt ttgagatgaa atctctcact gacctggagc 1140

tcctagactg tgtagactga caaaccaacg atgctcacgc cattgcctcc ttagcaggga 1200

atcatggatg tgcaagacca cacatgtcct ttttcatgga ttctggggac tcagtcccag 1260

gttctatatt tgtgcagcaa acattttatc agctatgcca tctccccgga cttcgttgtc 1320

atttattgta atgagtgatg tgagctaatg gtgtgccctg attcagatct cccgggttaa 1380

cgtgtaacgt ggaccactgt tcaaggcata tggggcagaa gttctgatcc attttaggtt 1440

ttccagagtg gcaggagagt taggtaaaag ggtcagcaat gacaaatgta gagtgccaag 1500

tgcttcatct acatgcagac aaagatatct gttcccccaa gacacctggg ttgtcactga 1560

gaaaatatcc acaaggcaaa acaatgccac gtgggcaaac ttacctacct tcagccttgc 1620

tattggagcc attcactggc tgctcaccag aaacacagga acggagttct taccctgtgg 1680

ggtactggga ccagggataa gattcctttt atttgtctgg acatgaattt aagctacaga 1740
acctcacaac ccattacaca ttgaagaact cactatatcc agccaccagc catctatcag 1800

gtgaccactg agaggttcca ggacccaaga aaaccacgag tgtgtgagcc tcctccctcc 1860 

acagtgtcgg agcctccccc tatgtcctat gctcccgctg gatacatcag cacatgcccc 1920 

accctggggc ctggcccacc ctttggccac cgtgagggca caaatgagac ttggagtcat 1980 

tccagcacct ctgctctgga gcggccgtgg aaggaggagg tttgttgttg atgaatcaat 2040 

agtagtgagt gactctcatt ccacggcagg aattccaggg atacaaacac acatctcaga 2100 

ttaaagaaca tgtgctagag catgaagtag cttatcacaa ccatttaccg tctccagcgc 2160 

agacagaatg acaggagaga ctaggcagag gtcacagggg gtacatttcg ccatcaaaag 2220 

ccgtatcgac tggagtccat tatgacaacc tccgagtgaa aaggtggaca ggctccggag 2280 

cattagtgag cttcccaaat acaagaaaat tgatccgtga ccaattaatc ctggcagctt 2340 

tcattttcga tttcctttct gattatacgg aaccattaaa atgaacccaa atagaaagga 2400 

atgagatggg acatcgg 2417 

<210> 8 

<211> 1298 

<212> DNA 

<213> Mus musculus 

<400> 8

gactgtggcc aggtgcagcc aagccactct tgccggcgct cgtatcctta gttaggtggc 60 

gaaatgctga tgggctctcc tattcggggg aatgcggtag tgatgtataa ggtatgaaag 120 

atgtggcctg cagagagccg gcactatctc tactgctgct gctgctgcca ccgctgcctt 180 

tgctgccgcc actactgccg ccactagagc tgccgggagt gagcctccct tcttccagat 240 

acccagctcc gttgcatctc ccgctccctg gccccgcacg gacctccccg ccccagggaa 300 

caaaagcaga gtccgggtcg cccctgttct gcagccagag gatgatgtgt cgagagggct 360 

gagcagagtg tggtcttcag cggctggaag ctgctccctg ccctttctcc atgagtcctg 420 

actcgtggcc ctggcaatct cagaagagct tcactgaatt cccttccgac cagaacctgg 480 

gaactcactc cccccagcat cacccctgtg cagcctggac acactgactg aaccaacttt 540 

taggacattt cattcagtaa gacatggtgg tgaaccaagg ctctaaatct tcatgattga 600 

aggactctta cacaaggcaa agatcaagat gctctttcac tcagtgttgg atctccactg 660 

tttctgcaca tctcatctga catcatctca cctcttagaa tgagatggca tttgctccct 720 
aaagagtttg atggctcaag catggtggat taggcctttt ctgctgggat gctatgactc 780

catttacatg tatgtgtctc tatcactatc catggccctc tgaaatgggg aaatacatct 840 

tttctattgt tttcctgatt gttaacacag tacccaacac atagtgaagt gtaggatatc 900 

tgataaacaa atctatcaat agatttggca aacggggtag ccctgtaata ttctatggaa 960 

agagtaaaat atatgcaaat gtggacagat ggtgtctcct ctctcaattt aaaatgacaa 1020 

agtgtattat tagtaaagca caaagaggaa aaagactgta cgtgggagtg ctgtaaagaa 1080 

actggctgtt gaagagatgc aaagacttca atagatttat tcatgcgctt acagataggc 1140 

agctttttag tccatggaca tttcatcatc ctgtacaatt gctattgaca tgccaatgag 1200 

ttccacattt ttctgtccat ccttactgaa ctcagggctc ctcttgcatt ctcaaacatc 1260 

ttctggatat attctggcag attaagtttc aaaaatcc 1298 

<210> 9 

<211> 4319 

<212> DNA 

<213> Mus musculus

<400> 9 

acacagtttc gaagaagccg gcggtgcccg caaaggctgg agtagcgaga acgactccca 60 
gtctagcggc tgcgcgctcc tcaacctgca gaccgaaagc tcgggaggac ccggcctagc 120 
cccgcagaac tggtgggtgc tggaccagag catctaggtg ctctaagccg gagactccag 180
cttagccaga gcccctgctc tcctcgggcg caacaacttt gacgatctat cgcggacaga 240
gctcaaaacc agacaacacc taagctggcc actccctcga agagcccgat ttggagagtc 300
tagatgccaa gctcagaagc agcgaggacc tcaggccaga aagttgaacc cgggccactg 360
ctgtgaagca gcttcctctt agccactcca ggggtctggt ctctctccat tctgaacaac 420
cgccatattg ttcatcagac ggaacaagtc tgagaccagc agatgaacag gcagaggcct 480
gaagacccca ggaacagctc tgccttcctt gttccttgat ccctctctgt caatcatcat 540
atcctacagt gacatcccgc cccaccgaaa gctgcctgga gacaagaccg atggagaggc 600
ccaccagcaa ctggagcgca ggcagctggg tgcttgcact gtgcctcgcc tggctgtgga 660
cgtgtccggc ctctgcttcc ttgcagcctc caacatccgc agtcctagtg aagcagggca 720

cctgcgaggt gattgctgca catcgctgct gcaaccggaa ccgcattgag gagcgctccc 780 
agacggtcaa atgctcctgc ctgtccggcc aggtggctgg caccaccaga gcaaagccct 840

cctgcgtgga cgcctccatc gtcctgcaga agtggtggtg tcagatggag ccctgcctgc 900 

tgggagagga gtgtaaggtg ctcccggacc tgtcagggtg gagctgcagc agtggacaca 960 

aggtcaaaac caccaaggtc acacggtaac tctcggaggt catggcttag gtaggacagc 1020 

cttgactgag ctcgggactg aagaaaggcc tggtcaccag acagcagata agaggactta 1080 

cctggacatg tgcccatgtc aagtgtaaca tagaccggcc agggcccctg ggcagcactc 1140 

tgttcagcta aaatgcttgg atctttggcc acacacttga gagacctgtg ctctcctatg 1200 

aacaaagtca acacacaaac ctatccttaa ggaatatctt gtcaagttaa ggagtggatg 1260 

acaggcaatt tcaacagttt aaagtgtttg gtaccgtggc ttctgagcga cctgcagtgg 1320 

gtgtggtggg gtggggcgga tggaatttac acagatcttc agccacccga tcccaggaca 1380 

caaagttatg agggccacac cagtatcttc ccatcttggc cttcccatca aacctgcatc 1440

cccagcaaga tgctaagatg tgagaggaaa acactgcccc ctggtgccaa gccagtggga 1500 

tctcttttgg acagcactgg aaagtagaga tgaagggctt cttccgcaca cctctgcaga 1560 

ggcaggggcg gggcgcacat cccatgctgt ccatggatgg acaggacctc agtcagaatc 1620

acccctccac cttgaacttc gggtgtttgg ttgctgtttt aaaggacgag gaaaccgagt 1680 

cacagaggat gagatggatt tgcccaggat ctcacagctt ggccttccaa aaacacggat 1740 

gttaggactc cagcctaggc cttccgtggc agttttatct agacagtgag accccgcaaa 1800 

cactgtctgg atagcaccaa catgattctt gggagatggg actttgacca ctggaatttt 1860 

ggttgaggcc actgatctcc catcaaacaa acaaacaaac aaaccctaga atatacacac 1920

acacacacac acacagcaga gtggagctcc ttctcagtcc cccaaggcca aacaggcccc 1980 

gtgacagccc agaggtcgct cagcccagcg gcagcagaag cagtgtgagc agttaggtat 2040 

cccatcaccc actctgcttc tctatcagga gcttcagcca gccccctaca taggtacctg 2100 

ccaggaggca gaggggcagg ccagaggact tcataaccag gctggcctcc tagatggctt 2160 

gggagtagca agctggctta cttttattac caaggcctta agtgttagag tgagtgttag 2220 

agaccagcct ggatgatggt tccctgagaa ggtaaatcag ccatagttta cactggaatc 2280 

ttggtgccct tctccttgcc tgtgtcccaa tgcatttgac taagactctg gtatcacccc 2340 

acctcagggc atgttttggg tttggcaaaa tgagatgaac agtaacgtta tctttcaaag 2400 
gcaggcctct gtcagtctct gcttgtagca cggtgacagg ccagggtcac caagggtcaa 2460

tggcagtcat ggtctccttc agtactgagg agtggaggcc agagggtgtg ttaagtctcc 2520 

gtgacatatt ctttgttttg ttgttttgct tctttttgtc tttaaaaaca ttatagactt 2580 

gccaatgcca tctttagctc tagggaggga aggaacacat tgttgcccat gtcaggtctg 2640 

tgacccttcc caggggacgt cgaaggcctg ggatggcctc ccagcctctt ggctcaccct 2700

ctgtaccatg gaaacctgag cggcagggct ggggctgggt ttgttttcct cctgatgctc 2760 

agagagagat gtgtgagtga attatttagg ttacggcttc agcagagcca gacagtgaaa 2820 

gcctgctttc ctggaggcaa aaggtagttc tgtcgggggg tgtggagcgc catagactct 2880 

ctgctcactc tgtcctggac actgtcaccc aacggtcggc tcccctgcct tgccccacct 2940 

gtgttgtcca cacgaactgg gttagagtgg agagttcaga gatgacgcgt cacaagcctc 3000 

cgagagcggc cagtctcctc cagctccccc ccaccccccg ctgcctcggg gtcctgattc 3060 

tgtaccttct ctgaccccca ctaaagttgg ctaattcctt gttaagtctc cctccctgcc 3120 

tcctgaccta gccatgcatt tagtgggtga actctgagct gagtgggtat ctgtagattt 3180 

ccagggaagc cccacacaaa aagctctggc aaagaaaaga tccagacccc cacctccaac 3240 

tccacccccc cccaaaattt gtatccagct gaatccaaac ctgtgtaggt tttgtagcta 3300 

tgcaaagaac agtctagaaa taaatggagg gttttttggt ttggttttgg gggggggttt 3360 

agagcatagg ggttcctcca gtcatttgac ttcctctgcc ctccttgacg agaacatctc 3420 

atcatctgag gtttgctgga accccatctc ttcagctatg cgggctctag gaaagccagc 3480 

aactactaaa tcatgtgact gatttctgct aacctcctgg cagttgtact gagtctgtga 3540 

gtgacatcct gattaatgat cccagacagc tccacaacat tgacttgcac attgagacga 3600 

cagctcccta aacccagaat tatagttgca agagggttta gaagccaact ctttgatttg 3660 

ccagcctggg agagggtagc tcagaaaggt taagtaattt gcccgaagtc acacagcaag 3720

ctggtggcat gtccttgctc agtgactcag catagagttc tttccactat tgtacatccc 3780 

attgtacccc agagcagatg agaacttgga aggagaacag ggttcttaac tgtgagagtc 3840 

tcattctgga atcgcccaac actgtgagag caagggccag agggaagaag agcaggaaac 3900 

acatagtgat cacatacaca tagggaagca agaaggggag gggtggatgc ccagaatttc 3960 

aaactacaga gtcaaaccga acaggacaaa tgccacacga aggagaactg gtgcctgtcc 4020 
tttcctgagg aggcaacatg gagtgttgtc agtccttggc aaactgggtg tgggggtgca 4080

tgcctatatc ccaggtactc aggaagcaat aaggaggatg gcaagatgga ggccaggctg 4140 

gacctaacag agaccctgtc aaaagaggaa caacaataac aacagaaaaa gaatggaaac 4200 

ccttgcagaa ccttgatgct gtcttgatca ccccaggttc tttttgtcct atgcagagga 4260 

ttgcaaacta aagcacatag gttacttttt ttttaacaaa taaagtttta ttgaaacat 4319 

<210> 10 

<211> 3423 

<212> DNA 

<213> Mus musculus

<400> 10 

ggctagactg aaacccccag aggctggccc ctaatggccc aagtctgccc tacttctcaa 60 
aaataggcta catcccacag cctgtcaaaa tagtgccacc tactggggac cacatgttca 120 
aacctaggcg cctgtgagga gtatttttgc ctttaatggg taacaattct caacttgttc 180 
ccacaccaca aacatagctt atgtgagatt tggtgagcac ctgagcacag ttccagcctg 240 
ctagttgatc acttaaatgc agcagccaaa gtccaagagc acacactgtc tcttgaggcc 300
agaaccccag cctcttggag catcaccccc cggtgccatg tttcttcccc ccacccccac 360
cccgaatatt tcttcagcct ttaatgttct ctcgtgttct cctcagctat taaatcctgc 420
aggtagataa tattcatctc tacttatcag gactctgtaa gtctatctca tgactacact 480
ttaaaaaaaa aagatttatt tattttagta tgtgagtaca ttgtacctgt cttcagacat 540
accagaagag ggtatcggat cccattacag atggttgtga gccaccatgt ggttgctggg 600
aattgaactc aggacctctg gagagcagtc agtgctctta accactgagc catctctcca 660
gcccccttat gactacacct taactgagtc tttcattcac tcactaaatc tcttttccct 720
tatggcccac acggttgttt tttccccagg tatctgtgta aattttactt aagttctaga 780
cagctttgtt gttttgcatg gtgtttataa ctggcatctt caatctagat ccagttgttt 840
gtactgaaca aatattttga ctttcaacac aatgtcttac gttgttgcta tgataaatgt 900
ttagtgttct gcataatatg cccaatagtg gagagatctg atgagagagg agagaagaaa 960
gaggctcatt tcaatgatct gctgttgatg ttttgaaatg tcctagcagg acttggtgtt 1020
ctgttcttcg actgagggaa aagcagttgt tagcagtcac ctgtcattct cctggcacaa 1080
atcttgagcc caagtaatga agcaattctt cacagggttg aggcccacag gaggctgcat 1140
agtagtctca gcagacagcc agcttcattc aggaggcccc tgacatctgt gtcctccctg 1200

atgaccacct gtgcacactt tctcagagga agaagagaga tcatccttta aagccaaatg 1260

tgaaacgagt aaatgattaa gtccatgcag tccctaaggg cccattcact ccgagtccac 1320

attgaatgaa gcttcaagac agatatggaa atggcagtct cataaccttc tggctctaca 1380

aggcatcact cattgtagtg ggccagagca gcattgcccc atagcaagac acaggaggat 1440

attcccacgt gtctccatat tgtaaacaga gtaatgttat aattttttgt cagaaataaa 1500

gactgacttc ctttaggcag gtttgtctgt ctgctgcgtg tagtgtgtgt gtgtgtatac 1560

acaaagatga aaagaggtat tccaagtccc ctggagctgg agttacaggt gtacagagct 1620

gcctgaggtg ggtgggtgct gggacccaga ctctggtcct ttggcagatt aggaagctct 1680

ctaaacaaac atctgagtta tctctccatc cccataagca ggctttggaa agcagaacta 1740

taccaggcta catgtaattg ttttacataa tctgctgcta tactgcccta tgtaaatata 1800

actatttata gtgcccaccc actctgggaa ttatcacttt ggcttacttt gctatggatg 1860

acagacgtgt aatagaatcg aactttttta cttgaatttt tttatatttc 1920

gagttttgcc tgtgataact gatggtctaa cctgaattgc atgctttcat tgtgcttgga 1980

tgaattcttt atatatttgg aaaatccagt ttctatagaa tcttattgct gcaagcttat 2040

aaaatgtcag atagggtgtt tagcattttt tgtgtgttat ctaatttaat tatcacagta 2100

tctctgtcac agggcattct gatatggtgt ttaagtattg aattgggtgg ttcttggttt 2160

aagttctgat accactgttt cctcttgtga ccattggtaa actttatata tcctgcaagt 2220

cttcctttgt gttatatttt tattttgtgt ctgtctgtct gtaccacctc actgtctata 2280

tgtgaatgta tgtatatata tatatatata tatatatata tatatacaca catatatgtg 2340

tgtgtgttat gtatgtatgt atgtgtacgc atatgtgtat gtacatatgt ttatgtgttc 2400

atctctctat gtctgtgtgt gtgtctgtgt gtatatttgt gtatctgtgt ctgtatgtct 2460

ctctctctct ctctctctct ctctctctct ctctctctct ctctgtgagt gcatgtgtgt 2520

agaagtaaga ggaagatgca ggcatcattc ctctctgggt tctagggttt gaactcaggt 2580

caccagggtt ggcaagcatc tttacactcc gagccatctt gcctgcccag tttcctctcc 2640

tatgtattgg aaataactgt ccctgtctta catgcttgta aggattgaga gagtgcatat 2700

gtgcaccatc atgtacacag cacacactca tttaatatcc attgagacag tgggtgtttt 2760
ctcttttttc cctgggttaa tttaggttga gaaccaccag tctagctggt ataatggaaa 2820

agacccatgg gctacttttg aagtaaaggg ggatttattt aggcttagaa ttttggagtc 2880 

aaacctaagg ctaggcagcc cattacttta ggccttgcac gccatggcag gagcacatgg 2940 

gcaagtaagt gctcacttct ccagccagga agcagcgtga gagactgact cagggtccac 3000 

agtctcatgg gcatagctca aggtctcagc ttcgaagatc agcagcacct acctaccaga 3060 

cctgccaccc tgggtatcga attctgacct ctgagggggc acccagctct aaagaaaacc 3120 

atctagctga ctctatcaaa gctgctgttt ttatagaagt tggtaaccaa ggcttaaaag 3180 

tatttgaatt atttccagtt ctgtgccttg tgctcaccaa aggctacagt gttgccaaca 3240 

gtcaagcagc tctgtgtggt tggtgccagg gacccatgca tggggtaggg ctcagaacct 3300 

gcagtcctta tctgggtact cagccagcct ggagcagatg aagctgaggc atgtcaatca 3360 

actaggagag aatttccagt gggaaaatag ggatgcccag cttgtgctca ggctttgggt 3420 
ttt 3423

<210> 11
<211> 3340
<212> DNA
<213> Mus musculus

<400> 11

gctcttccgt ccaccccttt ccaagtcctc agtgcctcgg tctctcaccc tctctgcatc 60
caagaccccc aggctcggtt ccgggcgcca cttcctcctc ttcaggtcag atctgttccc 120
ctggacccag gttctctggt aggtgacctg gaaggccttt gtccccgggg gcggggccgg 180
caccggcaca ggacacctgc aatgtcaccg tcttccacga gggtgcgcgc tagctagcac 240
cacttcttag agctgagaga cctttcaaat cctgcttaag agtcccgggc tttctctaca 300
gccttatcag ataagagtca ttaacaggag gtggagctga aggagaaaag aagccctagt 360
gaaaagaaag aatgcataaa gtagccaaca ctgctgggag ttctagagat taagggaaaa 420
ctttcactac ctgcttcctg acttgaattt cataaattac ggtattgttt attttattac 480
tgtttttgtg ttttgttttt tagagacagg gtttctttgt gtagctctgg ttgtcatgaa 540
acttgctctg tagaatagct taacctcaaa ctcacagaga tccgcctgcc tctgcctccc 600
aagtgctggg attaaaggcg tgcaccacca ccgcctggca cttattttac taatttttaa 660
aatttgtatt tttataggct agagagataa ctgggtgctt caaagctgcc gttgaggaga 720
acatgggttc gagtatcagc tcacagctgt ttgtaattcg tttcagtgaa cctgatgctc 780
tctctggtct ccatggactc cagacacaca tgtgatgctc aggcaggcac acatacaggc 840
aaaacatcaa aacgcataaa ataagttttt tttctaaatt ttgtaaatat gtgtgtgggg 900
attacatgta ggggaaggca tgtgcccaca gaagccagaa gagggtgatc tatcccttga 960
agctgaaatc acagatggtt gtgagctgcc tggcatgggt ggtgggaacc aaactcaggt 1020
cttctgcaag atcaatcgtc actctgaact tctgaaccat ccttctggtc ccttttttat 1080
tttttttatt aaagatttat ttattgtata tgagtacact gtagctgtct tcagacacag 1140
cagaagagga catcggatct cattacagat ggttgtgagc caccatatgg ttgctgggat 1200
ttgaactcag gacctctggg agaacagttg gtgctcttaa cgttgagcca tctctccagc 1260
ctcccctttt ttattttcaa ccttagatgc catatctttt gagctgtgca cctcgttttg 1320
ttgtcattgt tgtacatatt ttacaacttt gagagactat ctgtgtgtat acatacatgt 1380
gtgcaagtgc acaccaccat gtaggtgtgc ctttgtgcag actagaggtc agcactgggt 1440
gtcttcctcg gttgtgctcc atcttcttct ctgagtcagg gcctcggatg aaggccgaat 1500
cttgttgatt ctgctttggt ggctggccgg ccggcctgcg agctcagaga gcctcctgtc 1560
tccacctcca cagggctggc attatagttc ccgtgttgga catggaaccc aggccttcgt 1620
gctcatgtgg caggcctgag ccatctccct gacccccgtg tactcccatg tcacaaggag 1680
acgtacacaa aacccgcctg tctccagaaa cccaagttca gttctgccca ctaggtgctg 1740
agaaagataa cgttgagaag gggatttagg gatttctaat tgccaagctg tgctgcttct 1800
accttctacg cacagtgagg aggagatcgg aaaggagaga tttgggctca gaggggcatg 1860
caggggtgtg gagctggatt cagacttctt atggttttgg tgttttcctc tatgaatcat 1920
ctctgccatt ggcactgaag agcatgcgcc tcacacactt atggtcccat gacctctagg 1980
aattacttgt agggattaca tggaatctgt gatgccgcta ccattttaat tattgcaatg 2040
tataagcaca gaagcatgtg tgtatgtatg tgtgtgtaca tgtctgtgtg cctgtgtgct 2100
tgtctgtgtc catgtgtgct catatgtgtg tgtccatgtg cacatgagtg tgtgtgtgtg 2160
tgtgtgtacg agcatgtgtg catgtgctct tgtgcacatg tgtgcacatt tttccatttc 2220
tttttaaaat gtttttaatt taattttttt ttttttatgt tatgaaatcc gcctgcttct 2280
gcctccgaag tgctgggatc aaaggcgtgc gccaccaccc ctcagagagt cttccatttc 2340
tttaatcagt ttttcagaat ctggtctgag gatgtagctc agttggggtg cttgcctaac 2400 
acgtttgagt gctgtattca aacctaggca tcgcgtaaac acagtgtggt ggagaacacc 2460 
tgtaattcca ggaggcagaa ggatcagaag tccttgagca gccttggtta atagggagtt 2520 
caatgccggc ctgggcaggc atctctgaac caaaataaac ttgaattttc aggtacaggt 2580 
tttccctctc tgagtcctta gaactgtgtg tgtgtgtgtg tgtgtgtgtg tgtgtgtgtg 2640 
tgtgtgtcgt ^aagggcatc gctttctgat ttgtcttttt taatatttga tgtctcctct 2700
ctctgtgttt ttcaagactg ggtttctctg tgaagccctg gatatcctga actcactctg 2760
tagaccagac tggcttggta ctcagagatc tgagtgtctc tgcctgtctt ccaagtgcta 2820
ggactaaagg cttttgccac cactgcctgg ctagaaatat ttctcaagag ctagagaggc 2880
agctcattag ttaagagcgc ttgctcttgc agaagagcag tgttgggttc ccagctcctc 2940
gaccaggtag ttcacatagt ccaggaactc catctccagg ggatccaacg ccttcttctg 3000
gcctcagtgg gcacttgcag tttccttgta cacatattac taaataaaaa gaaatcttta 3060
gaaaaataaa ataaacaaca cagctgagta acttaaatta ttcacatggc tcccggcttg 3120
tggctgccca ttctgcctca ctaaggtcgc atctttggag ttaaaatcag tatagagctg 3180
atgttctcct taatcttcag acaccacagt cacctgtagc tccagttcca gggttttgat 3240
gccctcttct agtctctaag ggcactacac actcgtagtg aacatacata catacatact 3300
tgcaggcaaa acattcgtac tcattaaatt ttttttaaag 3340

<210> 12

<211> 3933

<212> DNA

<213> Mus musculus

<400> 12

gtctctcccc tccggctctc aggcatacag cttcccgggc caaaggcaga actctgccct 60

tggcaaagct tgccttacct gtaccaagat caccagcctc tccgttctct tcaggaaccc 120

ggcatctgag actcagctgt atgactagag gacatttcct cttagccttt ctttccacag 180

gcatccgggc gtgctcctca cactggtgtt ggcgaggcat ggggcttcct cacagtggtg 240

gggctggagt ggcttgtgtg cctcggtggg tttgtgagct gggtggaccc tgaccccaga 300

gcctggtagg gtctgtctga gcggtgaggg tgtgtgaggc atggtgacta gcccgtgccc 360

cacttctacc cctacctgct ctgccctcat cgtccctggc actgctggtg ggccctaccc 420

tgtccactgt gtctctccat gagtcctaag tcctctgtga gcggcgtggt ccgtgtgcac 480 

ctgtgtgcac gtgtgtgtgc gaggggcaca ggagtcctgt cttgtctctg tgctgtgtgg 540 
gcagatggag gttggcctgt ttttactctc tctgtgtttc tccttgtctt tttttattcc 600

ctctcatcct catcgcactc tgccatcaac ccaaactctc atctctcaga tcagcgagaa 660

ggttggtcgt tttcacttct tatccatcta cagttcgccc accgactgtg cccgtcagtt 720

gggaccagcg ggcttggtcg gctccctcag gctcgctcca gccgcgcctg cccgccagtt 780

ggcctcttct cggtgtttgg gctggctggc aggcaggacc agggatgggt gggcaggcct 840

tctgccctgc atgtacctgt tgcttgaccc tgattcgctg tgtgtctgca tgtcccctta 900

gccaggctcc gctgttggag tggccacaca tgcacggtgc acggcataaa actgtaccct 960

acagaactgc actcagcttt gcacccagac caggctagcc ctctgtttca ggaatggtca 1020

tggggatctc tgtcagagaa actaaactac atgcttggca tgtaggagtt cagggaagct 1080

tctaccattt ccccacccac ccttatgttg tctgcggaac taagagaagg cagaactgca 1140

ggaggcaata ggaggaggct aatttagagt gagttgtccc cttgagatct cacagtacgc 1200

gacacagtcc tgcagtgtct gccctgccct tgaggccccc agagccttac caatgctcca 1260

caaagcagac tggctgtact gagacttaag accagccttg gtctgcctct caaagaccct 1320

gatggccgta gtccaaggca ggtggcatca ttttcataca ctgagtaagt gtcccgacat 1380

tatatctttt ctgtctccta ccgggtccaa accaaggtca ccctgcctgc ccagaccact 1440

ctacccgagc ttccaagcca gcctctgtgc ctctccgagg gtacagtgta tgtggctcac 1500

tgaccagcta acagccctca gtcccatgct gggcagtggc actgagcgtt cagatccagt 1560

ttgcttatgc ctggagtgac tgaattgtca ctgcagtctc cttaggcctt gagtatcaat 1620

cagtccagct cagccttaga ttgggcaggg ctttttgaaa ctccctggat cctctcccag 1680

agctgtcccc tggagccagg cctctatttc ccagtacagg accaacaagg ggctggtggt 1740

cctttatata tcgtgtggtg taaagggtta atcactgggc tttggggggt ccaaggaaga 1800

agcacagggc agaccccagc taccaggtgg cctctgcatc atgctctgtg gctttgagtt 1860

tactccggga ggacaaggga gccagtcagg aggagggctg cagcctgggc tcagatatca 1920

acataagcct ccatctgtat gcactcctga gctctggggc atttcggtta caccatttcg 1980

ctgaccaggc ggatctgcca gggcgttcag aaaatagtca tcagcagaca gcccagcccg 2040

cagccactcc ccccacaccc cttcacctac cctctgcatc ctttgttctc tccccaccca 2100 

ccctgtggtc ccaccaaaaa tacacacata cacaaatatt taacgggaag gagaaatgag 2160 

tttctaaata ttccgggggg atttgccggg ctggtaattt gtttggtgct gataattgca 2220 

tcttacattt ttctagcttt acttaaaact gtgtgccggg tgtgccagca tttaattact 2280 

gctctgcggc agccacaagg ttatttatta aagagttatt ttatcttgat acagtggatc 2340 

ctgcccctta tcccctcctt aatgttgtgt tattttcatc agagaaattt ccgcaagtga 2400

atgcagattg cggggctacc tccccctgcg attaggctgt cactccactg aatgcggata 2460 

cctccaggcg ccgcaccacc ctattatagg gggttgtcag cgcaaataat tagacttaaa 2520 

agttacagct gaaatataat ccagaaatgg cagggccctg ttttggaaat tgtctataaa 2580 

atgtcagcag taaggatgca cggggaacag taatagaccg gcattgttgg agcctgagat 2640 

tagaccctaa gtgcattttc cccagctcca gtttttcctt ccctgctgtt gcttctgtca 2700 

atagaccaag tccagggaga gtcctgttcc ctttggaggc cctgtgcttg tggcggccgg 2760 

gggaggacgg tggagatgct atgttggagc atcagccact tgcaactgtt ggcacaggag 2820 

tagctgtcta ggctgggcta ggacacaggg cctctagcat ttggggcact ttgttgcctc 2880 

tttccccatt ttcagtagat atgctggcct acctgagctc tagcagattg acttccagga 2940 

gtctttgcat gtggctggac cccagtgccc ttctatagga atgtagttat cagtggaata 3000

cctggggcct ctggctgagt tcctttccca gtgctgaggt gaccctgacc tcacacctgt 3060

ctgtagcccc caaagttctc cctatagctt gcatcttgga gggaaaataa aagccattct 3120 

tagccgggcg tggtggcaca cgcctttaat cccagcactc aggaggcaaa ggcaggggga 3180 

tttctgagtt cgaggccagc ctggtctaca aagtgagttc caggacagcc agggctatac 3240 

agagaaaccc tgtctcgaaa aaccaaaaaa aaagaaaaag ccattcttag tgtcacttcc 3300 

catgggggcc cagtttcctg acatcgacta gagatggagc ctttgaaagg agcggcccag 3360

ggcattggcc tggctccgaa gctgatccct gaggactctg ctggcttgga aaacactgtc 3420

caccatggac tatccagggg tcaaagtttg gacatgttta ggtatgggcc ccaggtattt 3480

tagccaaaga cctacttcct actgcataaa cccagtggcc cctctttcat ttgggttcca 3540

ccattaacta gagccaccat taactagtgt cactctcaaa agtcttatct gtatgccatc 3600 
tggagctcga cattatgcca acgctaaagc cccatggcct tcatgggctt tcgaagatag 3660

agtttttacc acacagacct gatcttcctg aggatataaa ggcagatgcg tcacttcctg 3720

tgtcagcatt gactctggcc ctcatctgca attgagacat ggtggcacag gcaccctctg 3780 

tacagagtac agagtggctt tccactaggc atgatgtgca tggctttaat cccagcactc 3840 

agaaggcaga ggcaggtaga cctttgtgag ttcaaggcca ccctggtgta tacactaagc 3900

ttcaggatag ccagggatat ataaaccctg tcc 3933

<210> 13 

<211> 2272 

<212> DNA 

<213> Mus musculus 

<400> 13 

gatagatgat agatagatag atagatagat agatagatag attagataga tataaagtgg 60 
aaatagaagt tccaagtcca ctgtgaggag ccagggatgg cactgttcca tgtaagtgtt 120 
ccctggagac atttgatctc tacagttgtc tggcctggcc ctcttagact gcactgtcca 180 
ataaaactat tgtagaactg tttaaatcca atttgaaatt tcaaatagcc acataaaaat 240 
gtaaaagaag ataaagtcag aggtagagtg caagtttagc atttataata ctctgggttc 300
tatcaccagc acagactatc agatgataaa tgatagatga tagatgatag atagatagat 360
agatagatag atagatagat agatagatag atagatattc attccttttt agtaagttca 420
aatagtcttt gaacatgtaa tccacataaa gctaattaca aaataaatat attacatgga 480
tttttatatc atgtctttaa agttttatct gttgtgcata ttttaattca gatcagtcat 540
aagtagctgc catattggac agtggagcct tacaagactg acccttcccg ctgccttcca 600
aaacatgttc acatttagag gatggactct gccaacactg cctaagcccc ttcattttac 660
aggtgaagat gaccagatct agagatgctg ggttgcttag aatcccatag ccattcacat 720
cagaagcccc acccttttgt ctgacttcaa ttgctttcat cattctttct ttccttagat 780
ttgtgttgct gcaactactc actgtgagac tgttgctgtt ggttaaatat tgagccagga 840
gtcttaaaaa catttggtcc actcagtaaa gccactgttt gatttggtgg aaactgtttt 900
gagcagagta ggtagaacta ccaggcacat tgacagttct ccatgagata gtaataggag 960
gtgataatgg ggacaggggt gactttatgc ttgttggtgt tgctgtgggc attataggag 1020

tatgtgactt caaattttga aatgctagct gcctgcttac aggaagggag atgagctgtt 1080

cccatgcaaa gtttattgtc tagttgtatg tataagccta cacatgatga atatcttaat 1140 

gcttaactct ctctggaaat ggcagggtta agcttttata tttcatattt ttttcttgta 1200 

ttaggaacgt gtgtgtgtgt gtgtgtgtgt gtgtgtgtgt gtgtgtgtgt gtgtgtgtat 1260 

gtgtgtgtgt gtgaatgtac tcctatttct cactcagtta tacttaaagt gtgatgattt 1320 

ttgacaagtt aatcccagta cactcctggc atgtaaagta ttttcagaat gtaagcccat 1380

gtggaaattc gttggcttgg tatgacgggt tcgggcaggg gaggcatagt gtgatcccgt 1440 

ggagtgttcg gagctgccaa gtatttgttc ttcggtgtga aagcgtgctt gacgggtgga 1500 

ctttcctagc tggaaggtgc tttgttcatc cttcctaaaa catttatcag cacggatgtt 1560 
gtagatcgtg taagagagtc ccctttattc atgtgactta gttcagtgat cttagattta 1620

gagttaagtt tttgtgttcc agggttggga ctaattccca gccatggccc agctttaccc 1680

acaaaggaac taccgacctg taaggatatt catttgatgc gtttgctttt gttcctggtg 1740

ccagtcattg gattttgcct atagtgcttt atgtggctat acttagaagt tttatagctt 1800

gtagcaatca ggtgaaagta cagggatatg gctgagtact gtaacttgcg tctgtaatcc 1860

cagccagagg caggaggatc accagtccag ggcaatttga gctacagagg caaactgaaa 1920

aaggaaaaga aaaaaaaaaa gagattcctt tatttccttt ggccttcata cacaaatatt 1980

taaattctta atgtacggtt ttaagtcagc cccctacctc ccccaccttg gtagtttgca 2040

tagtacacat tagcatttga aacaaaagtt ttgagctata agatgctcat gggagtcatt 2100

ctgaaatatg tttaaggtgg attgattcgg aaaggatgca aatccaagtt aggctggctg 2160

aggcctcccc gcttctttag agggtgatac ggaaggtttg aggacccagt ctgcttcggt 2220

cggagtgtgg acagccagtg agggcagtgc agcaaaaccg tgtctcagaa tc 2272

<210> 14

<211> 1554

<212> DNA

<213> Mus musculus

<400> 14

atgtatgtac acacatatgt gtatatgtgt gtgtatgtgt gtatatatat gtacatatat 60

atgtgtgtgt atgtgtatat atgtatacat acacacacaa acacatatat atatatcaca 120

gaaagttatg gaccatttta aaatatcact atttgtcatt gatttagttt tacacaatac 180






tattctccag taacgtttct tctttttcca tttctttcat ttcttttctc tcttctttta 240 

ctaaattggc acacttgttt acttctcagg ttttgcaaac atccttaata tgttggtatg 300 

gcaaatgcat ttcaattcaa atagcaactg caagttgata tgcagaagaa cagaagtaat 360 

gtggtctgga attaatttgc tcaattgctt acaagcaata tgctatgtat gaaaagatta 420 

ccaaattaca ggtattaaat tctaaatgat ttaacagttt attatgattt attctattta 480 

acaacctagg gagtttgaag ccattggaaa ctccagtggc aagttcaggt gcagttggga 540

gttcaggaat ctcaatggag ctattagtgt agcagacttg tgggtaaaat atatgcatct 600 

ctcaatagca aatgtgaaag actctctaaa atggacctca gtgacttcat tctccacaag 660 

ctgtcctggc ccactcagca tatcctttat tgttccatga tattaaggca tgtctgcttt 720 

tttttttttt taataataaa tccatgacca cctgaaggtt ttaacttggt atgcacggca 780 

ttaaggcttt cacagccacc ataggtattt tctcctccat ccattatgtt attataaact 840 

attactttga tgccatagga actttcttag tttttcctac acattctggg acactggcag 900

aaataacttc ttccagatta tagttgcttt gtgcaacagg aaggagatca gtcaacaatg 960 

gatttataaa ctctcagata acacagaaag ttacatctca tgtccaaaaa tgaccttatt 1020 

acttggccct aatttagttt tgcatgaatt tttaaagcag gaatcataaa tggcccaagt 1080 

gtcttaattc aaattcttac ctgactgtgg agaagaaatc attgtgattt ttgaacacac 1140 

ataattatgg tttaaaaaca aactaacctg aatttgtttg aatatagttt ttcaactttc 1200 

tctgatacta ataaactcat cagccagttg gaaattgctt tccagaaaaa attctcattg 1260 

cttatatgtt catatatttg tacatgtata aaagatttca aaaacttgga ataagtttag 1320 

accattctta ataaggatat taatggatat ttaaagtgcc tgtcttttca gtgtttggaa 1380 

atttatttag tgtttcttca gggcgtatca aatcaataaa atgtccatgt ctgtgatgac 1440 

taggaagatg ttcttctatt ttctttctct tgctgcataa cctgtcttat tcagaactaa 1500 

atttcaaatg tgaacagttt tagctgaacc actgaattaa aataattatg aact 1554 

<210> 15 

<211> 4007 

<212> DNA

<213> Mus musculus

<400> 15

tgtgtgtgtg tgtgtacatg tgagttcagg ggcacatgga agccagaagg aggtgtgaga 60 

agctccacat tggtcctctc caagagcagg agatgctgtt taccactaag ccatctctct 120

gtccccagga agtgacaatc ttcaggcagt ttgtctgcat tttatacttt ctatttaatt 180

ataatattaa aattttgaag cacacactga gaaatctaaa ggtggtttat tcctttccca 240

tttttttaaa gagagtcaaa taaaggttca tttcctcaga ggaggtcagg tgaaatacaa 300

gggcttattg tcactcaaga gattcttccc aagaaaagtc acattaagaa aggaaaggag 360

agagacatgt aggtcaaaac agagataatg aagtcaagca aagataaatg tgacttaaat 420

aggcatttaa tgtggctgaa tcatatttaa ccatgtgttc tggacatata cccaaagggt 480

gagtccggtg gtatatcacc ttaggtctaa ggggaggaag gagcattctg attgagaagt 540

gttcatcact tttcttctgg gactaaggcc tttcccataa tgatggttta attcaatctc 600

gtcaattacc tcttctcaga gagctgcagg ttgtgccacc ttgctggaaa tggtcaaatc 660

ccattaaact tctgttcttt gtagcagagg gaaaaggaaa gatccagcga ttcaattaag 720

acataaatga ccaggtgtct gtggcacaga atcagctttt ctggtctgac caagaaaacc 780

caatttccat cttggctagg aagctatgtg agccctcact cacctttgct gttagcttcc 840

tggccatggt taaaggctca gactgcacag gaacacacaa gagttttctt tgtattcaca 900

gtggaagagc ccccctcccc acggagttgg gtggagtggc cagaagacct tctccatccc 960

tgccctttgc tgctttgttt ccagaaagga gggggaggaa attaccaggg tatgagatgc 1020

tgcctcttgg gaacaaggaa attaatgcag cgacctgaat tatgtgcgtg acacatttgc 1080

atatcgctca ctagcagttt ctgaagaaac aacaaatttt gtgtaattct aatctctttt 1140

gcaaagcaag ggaagaggaa aagctacctc cgatttactg tctacaaaga aagctgaatt 1200

ttaggatcaa atttgctcac atttagtggt aggctaaatg ttaaaacata agggtgcatc 1260

ttttaaatca ctctttgtgt gtggctgagt gtaaatgtca ggcatgtgta catgcatggg 1320

tgcatgtgtg catgggtacg tgtatgcatg catgcttgtg tacgtgtgtg cttgtgtgcg 1380

tgtgtgtgtg tgtgtgtgtg tgtgtgtgtg tgtgtgtgtg tgtgcacatg tatgcatgtg 1440

gaggacagag ttcaatgaca ggtctcattc tctatcatta tcaatgataa tttatagctt 1500

gaatatcttg tatcattata cctatgtctc ccagatcatt atattcattt gggagggtat 1560

ctagtataat ttactattat tatggtaatc atggaatttg ggggatggaa tgactgattc 1620

attgtcagtt tataaccagg ctgatgtaga aaactcaaat gaagaccaat tatcaatgtc 1680

tttgtagtca ataaaattaa tatctggaga agtgagtccc tctaaggaaa atcttctagt 1740 

cagagaaggc ataggaaggg agaatgctta aagtttgtat caacttctct ttgtgaggtt 1800 

taagcaattc tcatgttgca atcggccagt atcactgtaa tcctcgagat ttctagaaga 1860 

gcaaattggt gaagcttgca agccttttct gaaataagga aaattggcat agtttagaaa 1920 

aaaaaaatgt tgggagattt cttaacaatt gctttttgct gaacaaaagg ctggtacttt 1980 

ccttgatcaa gttgcagaaa tgctgtctgc tctgtgagtt ttgtagagag aaaaattaca 2040 

cattttataa acctggcttg gctgcttttc tgttttgatt ttgacttctg attctgctct 2100 

gattcgactt agctaccaaa gttgacagat tattgaccca cagtcacagc ttataccttc 2160 

tgtctctccc tccgtcccta gaagtctctg tctcttcact cctcctatca cagtgttaat 2220 

ggtgatatcc agatgacttt gacccaagca gaggcacaca gggtagagct gacatcatta 2280 

accattcact tccttcccct tcatttgagt tttcttatca gataaaggtt gctagtgttg 2340 

aagtcatagt ttccaaagaa acaaggcttg gggatgctcc aggtactaca ttcagagctg 2400 

ggggggctag atttggaaca tgccaagtat tttgaggcca gaagcacaca tcttttctct 2460 

tcgtgaagat gctgtgctca gcaagaataa tgacaacaca caagccatgg actcagtagc 2520 

actgaagtgt ggttcgaagt gggcttcatg tattatttaa ttgttagtct gaccttcaga 2580 

ctagaaccct ggagctataa caaggataat gtatttaagt aacccagctg gtaagtggta 2640 

aagcatggac tgaccacaga ttcctatctc tcgggaccag aagtttaatc agagaggtaa 2700 

atcgaatcaa ccaaagaaaa ttacgcttta agttaaatta agcattctaa aattatctag 2760

taatctcatc tctgcttctc gacacatgtc agatttgtat tggtactctc aaaattaatt 2820 
taccatatgt ttgtgtctct tgtgctagac tttggagcaa ctgttagaga tcaataaaca 2880 
agctctgaca catttcatac tctcatgaag ttctgccttg gagaatctcc ctagtttgag 2940 
tttctcaggt taaaattctt ttcaagaatt aattaaatta cttcactgag ttttagttcc 3000 
tgtctctctg cttgcagatt catatgggga cccatttcat cattatattg atttagattc 3060 
ttgccattca aaacgcttgg catccattgt cagccatact acagaaacag agatgcaccc 3120 
ttgtatagtg tgcagcgcac attccctcaa aatatttact atgtagttct ctatacgtta 3180 
ggaaatctat tcattcaaca cagatgttga gcctcattta atttactttc tccactttac 3240 
tgtatttttc ccagtatatt tttgccccgt ggtttttatg ttagaagaag ggccaacaga 3300

aagatagcag tgagttgggt aacaagtgtt tctttttgca aagtaaaaga cgccgttgct 3360

gggaaagcag taagcgtgaa cagggaggac tgtctaaatg actgacgcag ttggtgatga 3420 

cgtcagcact gggcactgac ttaacagaac tccactggga cttgtctatc ttctgtttgt 3480 

ccagtttgca gctaatccct agcaatgtcc tcagaccaca ggagaagggc taaattgctg 3540 

attaaaattc cttaagaaac aaaataactt caactgtatc ttccttgatt ttggttttta 3600 

gttcctagac actgtaaaat gacacacaca ggtgctgctt acctgctgct gtggcatcaa 3660 

tattgataaa gaaatgtact gcaaagtgtt tcatccatac ccagagcagg tgctagagca 3720 

attcattgta tatcatataa gattagagtc ttaatccttt tttttttcag atatacgtgc 3780 

atacacacac agatacagac acacacagag aaagacacac agacagagtg gcgcgcacac 3840 

acaaacaaat gctgtgctat ttgtatgtat aaccaacctg ggtagtatct gtgtactgtg 3900 

aaccctctgt ggttgtggtt tagccatcag agacgagcca tcagagccca gcctgatgtt 3960 

agagcccagc tcctgtattt ctcggctgag cagttctttt tccagct 4007 

<210> 16 

<211> 2755 

<212> DNA 

<213> Mus musculus

<400> 16 

gaacgatagg gccaaaaagt gggagtgggt gggtagggga gcagggcagg ggggagggta 60 

taggggactt tcaggatagc atttgaaatg taaatgaaga aaatatctaa taaaaaattg 120 

aaaaaaaaag tctgtttcca tacccaaaag atattcaaat atctgatata aagtctctaa 180 

ggagagtaaa tagagcctct tatcaaaatc tagcatttgc cattcacttg agtaagatat 240 

gcaagatgaa gttttaccag acctttgtgt gacagaactt ttcctgggga tacacaatac 300

acatatctac tcacccagac agggaatcca tgactgacca cagtacagat accaccgaag 360 

tccaacttgg taatccaacg aattttatta cgattactta cgggcgtatg gatgaggagt 420 

tacttaaagt agcagaaacg actgagaaga ctgtgtcacc aaagcccacc acagcatagg 480 

ggatgactta caaagctggg aaccaggagc acactacaca gtctgcaggc agctcaacca 540 

gttggggatt gtccttgcca ggtgcctcag ttggtctaaa ccttttccag gcagttggtc 600 
tgggctcagt cttctctgca gcttggtttc tctgagagtt gacacagctc aacttccttc 660

tgtctgagag agactttcag ctttttacaa ttcaaatgtt atcccctttc ctagtttccc 720 

ctctgaaaat cccctgtcct ctcctccctc tcgctgctcc ccaacccacc cactcctgct 780 

tcctggcctt ggcattcccc tataccaggg catagaaatg tcacaggatc aagggcttct 840 

cctctcaatg atgacctact aggccatcct ctgctacata tgcaattaga gccttgagtc 900 

cctccatgtg atttctttga ttggtggttt agtcccaggg aactctgggg ttgctggtta 960 

gttcatattg ttgttcctcc tagggggcta cagacccctt cagctccttc attggggacc 1020 

ctgtgctcca tctaatggat gactgtgagc atccacttct gtgttcatca ggcgctggca 1080 

gagcctctca agagacagtt atatcaggct cctgtcagca agctcttgtt ggcatctgca 1140 

atagtgtctg ggtttggtgg ttgtttatgg gatggatcct cagaagggac agtttttgga 1200 

tggccattac tttagtttct gctccaaatg ttgtctctaa taactcaggt attttgttct 1260

cctttctaag aaggatcaaa gtatccacac tttggtcttc cttcttcttg agtttcatgt 1320 

gttttacaaa ttgtatcttg ggtattctga gattctcggc taatatccac ttatcagtga 1380 

gtgtatatca tgtgtgttct tttgtgattg ggttacctca ctcaagatga tatcctccag 1440 

atccatccat ttgtctaaga atttcatgaa ttcattgttt ttaatagctg agtagtactc 1500 

cattgtgtaa atgtaccacg ttttctttac ccattcctct gttgagggac atctgggttc 1560 

tttccagctt ctggctatta taaataaggc tgctatgagc atagtggagc atgtgtcctt 1620 

cttaccggtt ggaacatctt ctggatatat gcccaggaga ggtattgcag gatcctctgg 1680 

tggtactata tccaattttc tgaggaacca cctgactgat ttccagagtg gttgtacaag 1740

cttgcaatcc catcagcaat ggaggagtgt tcctctttct ctacatcctc accagcatct 1800 

gctgtcgcct gagtttttta tcttagccat tctgacttgt gtgagatgga atctcagggt 1860 

tgttttgatt tgcatttccc tgatgattaa ggatgttgaa cttttttttt tttttttagg 1920 

tgcttctcag ccatttggta tttctcagtt gagaattctt tgtttagctc ttaccccact 1980 

tttaaatagg gttatttggt tttctggagt ccaacttctt gagttctctc tctatatatt 2040 

ggatattagc ccccttttgg atttaggatt ggtaatgatc ttttcccaat ctgttggttg 2100 

ccattttgtc ttattgacag tgtcctttgc cttatagaag ctttgtaatt ttatggcatc 2160 

ccatttgttg attcttgatc ttacagcaca agccattgct gttctgttca ggaatttttc 2220 
tcctgtgtcc atatctttga ggctcttcct cactttctcc tctgtaagtt tcggtgtcgc 2280

cagcactgct tttgcttact ctgtcaggga cgggttgaat caatctggtc cgtttcagag 2340 

agtttctgat gctgttttaa tatggacaaa actgttacat aaaacacttt tagtcaaacc 2400 

ttcacatgtg ataccaacca gggtcacatt tgttaacctc cagagagatg aacaaagtac 2460 

actcagaaaa ccctcttcag attcccaacc taattactct caaacagaaa tactactttt 2520 

gtttttgttt ttcagataga gtgtctctat aaagccctgg ttatcctaga aatggcttat 2580 

gtagaccaga ttggccttga atccatcaag atccacctac ctctgctttc tgagtgctgg 2640 

attaaaggca tgtaccacca tgccaggtta aaaaaaacca cacatacaaa aataatacaa 2700 



aactaatcaa ccaaacaacc aaccaaaaaa caaacaaaca aacaaacccc aaacc 2755




<210> 17 

<211> 1811 

<212> DNA 

<213> Mus musculus 






<400> 17 
ttttacatgt acctagagaa agaaaacaaa aaaacaaaaa aaccaaaggc tcattacatg 60
ttgagtgtct gactgaaagt gtagccctgg gactccatgt gtaacagtag caatgggatt 120
gcagcaaata agatgggaga agacactcag gtatggctgc attgaggaaa ctaatcatat 180
ccagcatttc tggtgttgaa ttaaactgac atggtgaaag tcaagctttg gttttgtaaa 240
tgccagagaa aaaacagaga acattgcaca gtaaacctga gtttaaatgg cctcagtgtt 300
tgaggcaagc tttgaagcta gggtgtcaat atctgtgttt tctgtaacta atctgaagag 360
ccctcgagag tactgtgtct gtgaggcctg aagttcagat ttctctccag agttcttttg 420
attgttctta tgtacccagg agaagtcatt gtgattctta gaatggatca tttgagtgat 480
tactgctcct ctggtggctg acaaacaaca tggtcccaaa agaatttcag gggtaagacc 540
tttcggtctt agaacatcca catgtggcag ggcacattgg ccctttctta cccagaactc 600
ttttgtgtgg ctgcatctct ttctccagtt ctctcttaac tggagcatgc ctattctttg 660
tccttttgtc tttagtatct ttgttgttgt tagatgccat ctgaagttac actcacaagt 720
ttaagcaaat gctgccttct tgtcagcctt gtgattctgt gtgcatttca tgcaagggaa 780
cctgtatgtt tgctctctgc attcttacat ttcatattgt ctcttcttcc accacatggt 840
aaatgttatg agtacacatt tgtcttcatc aacgaaaatg catgtgaaaa cctctctgtc 900
cttcttgcat cttagctttg ttttgttctc acatttttac tgatattcca caagaaagat 960
ttgccatatc catttaggaa acattgagcg atagatagtt ctcctagaaa ttctcaaaga 1020
catgttattt accatatcaa actctcagga atgtattgga aaattatcct agtaacattg 1080
ctgttactta tacctgctgc tggaggaaaa aagtttattc atgacacatt tacctttgat 1140
cataaataaa agagaaaggg ccaccttttt ggagttagtc atggtagtca ttagtgatat 1200
ttttgaaccg tttttaattt gaaatacttc aaaggaagta aaggtcatgg cttagctcaa 1260
aaaaatgctc cagaagttgg gctgcttaaa tcatcagtac aataatatac tgtgtgtgcg 1320
tgcatgcgtg tgtgtgtgtg tgtgtgtgca tgcgtgtgtg tgtgtgtgtg tgtgtgtgtg 1380
tgtgtgtgtg tatgaaatga gaattgccac ataattggag catttgcatt catccgcaaa 1440
tcatgttgag acaaactgta gctggccgtc ttgattaaag ccaagtggtc cctgtggctg 1500
tgaggaagag tttttcttgc aaaaactttg gcaagtgatg acttcgaaac ttacaaaggc 1560
tattgccttt ttttttttta acaccagtag tgaccttcag ttccttcagt ctgagtttaa 1620
gggtagaatt ctaaatttgt attctaatct gtcttttgtg gaaaattttg aaaatagtat 1680
gtatatgtaa tattgtatat gcaaattgtg ttgttttact tgttttgcat atgaccagca 1740
ctgactgaaa ggcatgttta actataaaca ctgttgcttt ctttgtgaaa tgaaaataaa 1800
agtatttaaa t 1811



<210> 18 

<211> 2438 

<212> DNA 

<213> Mus musculus 

<400> 18

gcaccgtttg gtgtgtgtgt gtgtgtgtat gtgtgtgtac atgcgcgcat gtgcatttgt 60 

gtgtgtgtgt gtatgtgtgt gtatttgtgc ttacacatgt gcacgtgaaa gtatgtgtat 120 

atatgaatgt aagtgtgcat gtgtatgtgt gtgcatgtgt gtaagtatgt gtgttaattg 180 

tgttttttgt gagcaagtaa ataaataaat agttctttct taaagcagca aaagaaaagc 240 

agcaagtcat gtaaaaataa gtcccatcag aataacagca gatttatcag tagttatctt 300 

acagacaagc ggaggatggc attatatact caaagctgta aaaaataatt tttcaatcaa 360 

aatataccta gaaaagctat cccccaagga tgaaagagaa gtgaaaactt tcctaaagaa 420 
aacaaaaact caagacattt gtcagcattt tatcagctct acaagagatg cccaagggag 480

cagtttatag aaattgaagg atgactacca atatacaaat gcacaaaact acaaaaaaat 540 

gtgcaaaata gtgctaaaat atgcaggtga atgaggaatc taactctatt tgtacaaaga 600 

aatcaccaag gcacaaaatg aacaagaaag gtgggaacaa aaaaaggata cacaaaacca 660 

gaaaaaaatg acagtactga ctattcattg ttaataatgt taaatattgc actggcagtt 720 

cttctctttt ttttttattt tattagatag tttctccaat tacatttcaa atgttatccc 780 

ctttcctggt ttcccctctg aaaatccctt tcccttaact ccttccctct ccccctgctc 840 

accaacccac ccactccaac ttcctgtccc tagcattccc ctacactgga gcatggagcc 900 

ttcacaggat gaagggcctc tcctcccatt gataaccaac taggccatcc tctgctgcat 960 

atgtagctgg agccatgagc cccaccatgt gtattctttg gttgttggtt tagtccctgg 1020 

gagctctgtg gatagtgctt agttcatatt gttgttcctc ctatggagct gtaaacccct 1080 

tcaactcctt tggtcctttc tccagctcct tcaatgggga ccctgtactc agtccaatgg 1140 

atggctgtga acatccactt ctgtatttgt caggcactgg caaagcctct caggggaaag 1200 

ctatatcagg ctcctgccag caagcacttg ttggcagcta caatagtgtc tgggtttgga 1260 

tggatcccca ggtggcacag tctctggatg gtcattcttt cagtctctgc tccccatttt 1320 

gtctctgtaa ctccttccat gggtattttg ttcccacttc taggaaggat cgaagtatcc 1380 

acactttggt cttccttcct cttgagtttc atgtggtttg tgaattgtat cttgggtatt 1440 

tcgagcttct ggctagtatc cacttatcag taagtgtgta tcatgtgtgt tcttttgtga 1500 

ttgggttacc tcactcagga tgatatcctc cagatcaatc catttgccta ggaatttcat 1560 

aaattcattg tttttaataa ctaagtggta ctccattgtg taaatgtacc atattttttt 1620 

tttatccatt cctctattga ggggcatctg ggttctttcc agcttctggc tattataaat 1680 

aaggctgcta taaacatagt ggagcatgtg tccttattac ctgttggaga atcttttgga 1740 

tatatgccca ggaatggtat ggctgggtcc tcaggtagta ctatgtccaa tcttctgagg 1800 

aaccgccaga ctgatttcca aagtggttgt accagcttgc aaccccacca gcaatggagg 1860 

agtgttcctc tttctccaca tcctcaccag catctgctgt cattggagtt ttttatctta 1920

gccattctga ctggtgtgag gtggaatctc agggttgttt tgatttgcat ttccctggtg 1980 

actaaggatg ttaaacactt tttaggtgct tctcagtcat tcagtattcc ttagttgaga 2040 
gttctttgtt tagctctgta cccccatttt ttaatatggt tttttgtttt tctggagtct 2100

aacttcttga ggtctttgta tatattggat attagccctc tattggattt aggattggta 2160 

aagagatatg agcacagggg gaaaattcct gaccagagca ccaatggctt gtgctgtaag 2220 

atcaagaatt gacaaatggg acctcataaa attgcaaacc ttctgtaagg caaaggacac 2280 

tgtcaataat accaaaaggc agttcttatt tctttttctg gttttttttt gaggcagcag 2340 

agggagaaga gtgtcagcga gggtaatttt tggtcttagg agatatttag ggttgctgta 2400 

taaagcatct tcttgtatta agtctaagtc gatttagc 2438 

<210> 19 
<211> 1712 
<212> DNA 

<213> Mus musculus

<400> 19 

ggcagacggg agtttctcct cgggacggag caggaggcac gcggagtgag gccacgcatg 60 

agccgaagct aaccccccac cccagccgca aagagtctac atgtctaggg tctagacatg 120 

ttcagctttg tggacctccg gctcctgctc ctcttagggg ccactgccct cctgacgcat 180 

ggccaagaag acatccctga agtcagctgc atacacaatg gcctaagggt ccccaatggt 240 

gagacgtgga aacccgaggt atgcttgatc tgtatctgcc acaatggcac ggctgtgtgc 300 

gatgacgtgc aatgcaatga agaactggac tgtcccaacc cccaaagacg ggagggcgag 360 

tgctgtgctt tctgcccgga agaatacgta tcaccaaact cagaagatgt aggagtcgag 420 

ggacccaagg gagaccctgg cccccaaggc ccaaggggac ccgttggccc ccctggtgaa 480 

cctggcgagc ctggcggttc aggtccaatg ggtccccgag gtccccctgg ccctcctggc 540 

aagaatggag atgatgggga agctgggcaa gcccggccgt cctggtcccc ctgggccccc 600 

cggacccgct ggccttggag gaaactttgc ttcccagatg tcctatggct atgatgaaaa 660

atcagctgga gtttccgtgc ctggccccat gggtccttct ggtcctcgtg gtctccctgg 720 

cccccctggt gcacctggtc cacaaggttt ccaaggcccc cctggtgaac caggaacacc 780 

aggaggagga ggagaagaaa taatgagtga ttgtgtctcc gtttatggaa agagtttgat 840 

ggggtactaa tgttggtaaa tataatattc aaacatgaaa ttctaataaa ataagtgaaa 900 

gatatcagaa agccttcaaa atcctgcaaa cacaatatac agaatatata ttaaatttaa 960

ttgacaacta taccttctta gatctattgt tcatttgata attatttcaa attttttctt 1020 
ctccatttaa tagtatggct cttaagagac ctgcagtgta tgttaagact atttggattt 1080

gtgctttcat aaggcttaaa atgtttatgt attttttttt aattttctat ttgtatcttt 1140 

gtatagcatc tttgaagaaa tgtgtactca aagaattact catgtagaat ctcactgtgc 1200 

tgccatctgg gacatcaagt tcatttgtga acccatgctt cctctcagca tcataagaag 1260 

atcactacag aagctgatca ccacacttaa ctagtggtat tgtaaaatct cttcattaat 1320 

gactttcatt tttgcttggc ttgtgtataa aattgaattg aaagtttata ttatctcatt 1380 

tgctaacagg tttgcttaaa accctcttag aaataacatt tttattactt gtcatgtttt 1440

tttttattta cagcaagata tttaaacact gaatatattt tgtagtattt aatttttcat 1500 

tatttaaaac aatgtttaaa tcaaatattc aattatatta tgccattcct ccaagttctt 1560 

catattgtct cccattttct acccaacttt aagttctttg tcaataaaga tacaaaaatc 1620 

caattgaaaa caaacccctc aaaatcagga aaacacattc aaaacaaaca atcagaaaac 1680 

cccaaacaaa caataaagaa aacaacaaaa at 1712 



<210> 20
<211> 3651
<212> DNA
<213> Mus musculus

<400> 20

aaaagactgg ctagttgaga atgcaccagg ggatgaggtt ccagttggcc attctttaac 60

aggttttgcc actgcctttg actttttata taatctatta ggtaatcagc gtaaacaaaa 120

atacctagaa aaaatttgga ttgttactga ggaaatgtat gaatattcca agattcgatc 180

atggggcaaa caacttcttc ataaccatca agctacaaat atgatagctt tactcatagg 240

ggccttggtt actggagtag ataaaggatc taaagcaaac atatggaaac aagttgttgt 300

tgatgtgatg gaaaagacta tgtttctctt gaagcatatt gtagatggct cattggatga 360

aggtgtggcc tatggaagct atacctcaaa atcagttaca cagtatgttt ttttggcaca 420

acgccatttt aacatcaaca actttgataa taactggcta aaaatgcatt tttggtttta 480

ttatgctaca cttttgccag gctatcaaag aactgtaggc atagcagatt ccaattataa 540

ttggttttat ggtccagaga gccagctagt tttcttggat aagttcattt tacagaatgg 600

agctggaaat tggttagctc agcaaattag aaagcatcga cctaaggatg gaccaatggt 660

tccttccact gctcagcggt ggagtactct tcatactgaa tacatctggt atgatccaac 720

actcacccca cagcctcctg ttgattttgg cactgcaaaa atgcacacat ttcctaactg 780 

gggtgtcgtg acttatgggg gtgggctgcc aaacacccag accaatacct ttgtgtcttt 840 

taaatctggg aaactgggag gacgagctgt gtatgacata gttcactttc agccatattc 900 

ctggattgat ggatggagaa gctttaaccc aggacatgaa catccagatc aaaattcatt 960 

tactttcgct cctaatgggc aggtattcgt ttctgaggct ctttatggac caaaattgag 1020 

gccaccttaa caacgtattg gtgtttgccc catcaccatc aagtcaatgt aatcagccct 1080 

gggaaggtca actgggagaa tgtgcacagt ggctcaagtg gactggggaa gaggttggtg 1140 

atgcagctgg ggaagttatt actgctgctc aacatggtga taggatgttt gtgagtgggg 1200 

aagcagtgtc tgcttattct tctgccatga gactgaaaag tgtctatcgt gctttacttc 1260 

ttttaaattc acaaactctg cttgttgtcg atcatattga aaggcaagaa acttccccaa 1320 

taaattctgt cagtgccttc tttcataatt tggatattga ttttaaatac atcccataca 1380 

agtttatgaa tagatataat ggtgccatga tggatgtgtg ggatgcacac tataaaatgt 1440 

tttggtttga tcaccatggc aacagtcctg tggctaatat acaggaagca gaacaggctg 1500 

ctgaatttaa gaaacggtgg acacagtttg ttaatgttac atttcatatg gaatccacaa 1560 

tcacaagaat tgcttatgta ttttatgggc catatgtcaa tgtttccagc tgcagattta 1620 

ttgatagttc cagttgtgga cttcagattt ctttacatgt caacagtact gaacatagtg 1680 

tgtctgttgt aactgactat caaaacctta aaagcagatt cagttacctg ggatttggtg 1740 

gttttgccag tgtggctaat caaggacaga taaccagatt tggtttgggt actcaagaaa 1800 

tagtaaaccc tgtaagacat gataaagtta atttcccctt tgggtttaaa tttaatatag 1860 

cagttggatt cattttgtgt attagtttgg ttattttaac ttttcaatgg cggttttacc 1920 

tttcctttag aaagctaatg cgctgtgtat taatacttgt tattgccttg tggtttattg 1980 

agcttctgga tgtatggagt acatgcactc agcccatctg tgcaaaatgg acaaggagct 2040 

gaagctaagg caaatgagaa ggtcatgatt tctgaagggc atcatgtgga tcttcctaat 2100 

gttattatta cctcactccc tggttcagga gctgaaattc tcaaacagct ttttttcaac 2160 

agcagtgatt ttctctacat cagaattcct acagcctaca tggatatccc tgaaactgaa 2220 

tttgaaattg actcatttgt agatgcttgt gagtggaaag tatcagatat ccgcagtggg 2280 
cactttcatc ttcttcgagg gtggctgcag tctttggtcc aggatacaaa acttcacttg 2340 

caaaacatcc atctacatga aaccagtagg agtaaactgg cccaatattt tacaactaat 2400 

aaggacaaaa agcgaaaatt aaaaagaagg gagtctttgc aagttcaaag aagtagaata 2460 

aaaggaccat ttgatagaga tgctgaatat attagggctt taagaagaca ccttgtttat 2520 

tacccaagtg cacgtcctgt gctcagctta agtagtggta gctggacatt gaagcttcat 2580 

ttttttcagg aagttttagg aacttcaatg cgggcattgt acatagtaag agaccctcga 2640 

gcttggatct attcagtgct atatggtagt aaaccaagtc tttattcttt gaagaatgta 2700 

ccagagcact tagcaaaatt gtttaaaata gaggaaggta aaagcaaatg taattcgaat 2760 

tctggctatg cttttgagta tgaatcactg aagaaagaat tagaaatatc ccaatcaaat 2820 

gctatctcct tattatctca tttgtgggta gcaaacactg cagcagcctt gagaataaat 2880 

acagatttgc tgcctaccaa ttaccatctg gtcaagtttg aagatattgt tcattttcct 2940 

cagaagacta ctgaaaggat ttttgctttc cttggcattc ctttgtctcc tgctagttta 3000 

aaccaaatgc tatttgccac ttccacaaac cttttttatc ttccatatga gggggaaata 3060 

tcaccatcta atactaatat ttggaaaaca aacttgccta gagatgaaat taaactaatt 3120 

gaaaacattt gctggacact gatggatcat ctaggatatc caaagtttat ggactaaatg 3180 

ctgcaggtcg gcaaaatttg cactaatgtg tcccaaccta ctttgtggat atgaactaga 3240 

aaactttgtt tattcttgta catgtatgta tgtgtgtaga gtgagtgcgt gtgtccagta 3300 

tgttatttgc acagagatat tttcaaaata ggcaccatat ttggcctagc aggatttatt 3360 

tttatgttac cacttttctt gcctttgttt ctgaattttt ttctgctaaa atgtttctgc 3420 

tacagaggta tatattctgg ggttctgaaa tatggggttt taatggactt taactcaact 3480 

tctttggaaa ctatttatct atcttaggac ctcaaacact acaaacggcc ttgcaattgc 3540 

tgctgtatct agtcatctct cgcctcttaa tatggactac aaaactttat gttttgaaaa 3600 

cgtctaacat ttaccttgca cacaaaaacg agaaataaaa aaacaaaaat c 3651 

<210> 21 

<211> 2205 

<212> DNA 

<213> Mus musculus 

<400> 21 

acatcctctt aaataatctt accaaggaat aatcaggaac agtcacgctt ctgtgtccct 60 
ttctgttttg ctaaagctaa gcttatgtac caagttagaa catcaatgac attaaatgtg 120 

gagacctttg ctactttttg ttagagggca tccctatatt tgcttgcttt tattttaaac 180 

cgtggaaatc tgaatccaga tagaaacaaa attagggttt tttccatacc acatgctagc 240 

atttgcactg attttcatag gaaaaaaaac ttcaaacaga acaaaataaa aacatgtagc 300 

ctatagccat tcctatttaa aatgattggt ttccttggct aggtaaaatt ctgcatgatt 360 

aaattgccaa taattctgac atttgggttt ttgcatagat tttccaaaat ttaggtccta 420 

agttgttatg gtaacttttt tttaaagaaa gtttaatttt aaattacaga tggattttgc 480 

tgggcattag caatttgtgt ttatttagaa aatagagtgc tcttattttt gtaaatgtct 540 

cacggaaata actaaatttg tttataaatt gagactacta aagcacaatc gttgaagcca 600 

tagagaacat cttgaaatac agttttaagt ggagaatttt aggaaactta cataatatca 660 

taactcaaat atatttaaat tgcaattctc tcagccttta tactcatgtg ctgtatacac 720 

agttactcta aacaatgtaa gagacatata cagtagcccc tagagttatg aatttttaag 780 

tcaattaatt tccatgaaga aaattgagaa tggctgttta tgtctgtata tggtgtgatc 840 
ctctagttgg tgcatgcatg tgtgcatgca tgtgtgtatg tgtgcatgta cacgtgcgtg 900 
tgtatgtttg tgtgtgtgtg tgtgttgagt tttctcccaa tccttgtaat ataggaaatg 960 

aacacttatc caaatgttga gagttcattc acaccgcatc tgagttacta ggtcctggga 1020 

cagtggatat aggtattttt ctcttttgtg gccaatttat ttaaatataa aacaatggtg 1080 

ttagtcttag taggagttta gagtgacaaa gacttaaaat ttcctttgag agtggtattg 1140 

ctcatgccta gcatctttgt gtatgtgcag aaaaggagag tagtgttagg ggctgctgag 1200 

actatgggga gaaatgatga tacattgaag agctaggtct agggagagaa atcaaaatac 1260 

tcttgaaaga taggaaaaca ttgacatagg gctacctcat atttttttta tttatttgca 1320 

tgaaataaaa ctagaattat aaaattcaca ttctcaattg ggatattata tttgttaatc 1380 

cataaaacta tttacattgt atgtggcaag ttgtagtcat ttttaagagt tagactctta 1440 

ttgcttccaa ccaagaaaaa taaatgaatt cagtctagaa ttggcaagag taatgaagta 1500 

ataattgtaa aaattgttgg agtatgtctt cctgagaatc atagtctcct gtatagcttg 1560 

actggccttg aagaacagtg gtgacaacag agcctgatgt ctgttgcacc tttgagtctg 1620 

tcattatttt atagacaaga tctgaagctc tgataaccct ggctaaaaac atttttaaga 1680 
aaagaaagtt actttaatat tatatattat tgttttactc atcaatagtt attgcttatg 1740 

gttatattta ttcaatggaa aacactgtat ttgtattgga tacaaacatg gatttgatat 1800 

ccttttcttt aaatatatta aatgaaatta aaaaacatat ggggctttcc ttggggttga 1860 

tttttcttcc cctataaaat gagtttctgt ggttcagaga caacctgaca gatcaactaa 1920 

aatatgaaac tgtgagtctg tgatctgaca gatcaactaa aatatgaaac tgtgagtctg 1980 

tgatgcttag ggcttcttgc acctaaagta ggaattatta gaaccagttc tttgttcaat 2040 

gcatgctatg gtttgactaa tgtcgaatgg ttgctgtgca tcagaaaagg aagatattat 2100 

tgagcttctc tcagtttgaa gaaggcaagt gaagtaccaa gcaggtatta gcagagttat 2160 

ttaaagctgc agactttaca ttagggaaca cactcagata aaact 2205 

<210> 22 

<211> 4059 

<212> DNA 

<213> Mus musculus 

<400> 22 

aaatatatat ggaagtgatt ttcttaaacc cttaaaagac caatcaaatg tggtgttact 60 

aattcttatg taacagatta actccaactg ctagaaggta acatgaaaaa tgcccagctt 120 

gattggcatg tctgttgtat ttcttgacag cttgaaatga tgtgtgttga gaaatgaatc 180 

tttccaggat ggtgctttca tttagtgcag tctactctag ctggttgctc tgaattttag 240 

tcccagacca aagctggtag gcaaccttat actttgccac tatggcaacc ccgcattgct 300 

ggtgttacct gctgggtaag cactaaagag agagccgtca ggccgctctg cccattccat 360 

ccctaactgt gcttcttcac cctgcctcct gtgctttctc tgatcaacca tcgacacccc 420 

cttccttctg ctctgccggt gaccacagaa gtctggaaag caagtgatga ccatgagcat 480 

atagtgaaac agcctcacag gcaggttatg tttcgttctt ctgaactcag tccactaaca 540 
cacaccatgg ctttaggtat ggaagaacat gtctaacaac tccctggacc agggtgtttt 600 
gctactgcag tagcagtata actaaatgaa tgactaaaat gcccagcttg atttaactgc 660 

ctgtccagtg cttcttgctc agctggggct ctctcctctc actccatagg gagaagaaaa 720 

accaataagc tctcaccttt tccacttggt agtctaaagc ttagcttgat ccccatatgt 780 

aatttcttcc ttcttgtctc aagaagtctt taatctattt ctatcattct ttaaatcctg 840 

actagtctga gttatttctg tctttacaaa gtttcactac ttgttgctta aaggataaaa 900 

atctctgctc atcagcatag cagcaataat tcctcatgcc tggccttttg atttctttgt 960 

cccatgatga tcttcagcca gttccagatg gtcatctcct ctctgtagaa ggaacccaca 1020 

tgttcaaggt cctactcaga aaaggctctt cagttgccaa gatgcactgg atcctacaca 1080 

ttatagttat tcctgtgtct ttgttcatcc attaatgtgg tgcttataat agtgggtatt 1140 

agtatggtca tcatcattga cttcctagct gtccaagtcg caagaagaga ttgtctgcat 1200 

gagtttgtat ttcctgagtt gttccaaact ttttggaata cattttttct taaacacaaa 1260 

tagtctcatc cttccacaaa ggctctgagc tctactcata gtttgtctct taaccccaag 1320 

ctctctgggt tggtcaagca gtcacatgtg ccatcctact agccatggaa tgaatgactt 1380 

tgatatcaac tcaaaaatct atgaccagtg aaagccaaag acacttggtt aggagactct 1440 

gatttaacct ttcagggaaa ctgaatgtaa gaaagaaaga ctatgaccca gctccggtat 1500 

ctagtgaacg gagagacaga aaagctgccc ttggttcttg tgggtccccg gaatgcaccc 1560 

cagtagcctt ctgctggacc cttttgtatg taggtcaaag gattggcttg gttctttgca 1620 

acagcaagga ttctaacgtc tgtaggtatt ttcctttgag gcttgagatc tattgggtgt 1680 

agaagctcag tccacaccct gttcaaaatg gtgtaagact tctaaattcc cttgagtcaa 1740 

ggcagaactg tggccttaca agaataaagc ctatcttcgg tgcctaccta atatccctaa 1800 

ctaggctctt tgaccttggc cctttgtcct ataactccct cacttagaat gctttctctt 1860 

ctcttcttac ctaactacaa cttgtacctc agaacctacc tcaggtgacg catcatcagg 1920 

aagaccgctc ctcataccct aagtcagccc atgtcacacc actgtcctac ttgtctttgt 1980 

tctctgctac atggtaactc ctcaagggcc aaaaatggtc tcattttagc tttgcacttc 2040 

tagagccaaa cagtccttgc cccattacag gcttgctaaa tgtttataag tgaatggatg 2100 

gggccatcta actcagtagt cattttcctc ttgatggcga cactgttcag agtgacatct 2160 

tgtccagcat catgtagcca ttttaccttt agtttactaa agagaaagtt ttaaagggaa 2220 

taatatcttt agaggagaaa attaatgctt tattttttca tttaagtgaa tacatatata 2280 

gtaaaactga acatatgtaa agtagaattc tctttaagtg taccgtttgg tataaatgga 2340 

tgtacccatg tcacctcttt taataattga ggtagccgaa tgttcccctc acctaagaca 2400 

gtagatccct atatgccctt cctattcagt ctcctgcaca ctcacagatg gggactccaa 2460 
accgtcacag aactgtggct agccctaaag actagccttt cttctctaga ggttcaactc 2520 

tatactcttg tgtctggctt gctctacttg gcataatgtg ctgggattca tactgtggca 2580 

ttggtagttc actgtcgatt gtggctgaat acatatcttc atagaggcat gcatcttcct 2640 

tcccctgagg caaatgccaa gtataaccga tactgtgagt catatatccg tacttgttta 2700 

ccttcctaag aagtcttcag gttcttgtgt gaagtgattc caccatttac actcccacca 2760 

acagtggagg agttgcagtg gttccccgtc ctggttgtca atgtattaat taagttaata 2820 

ggtgggcatg ttttattttt tctgttattt cccccgttag atagaacagc accttattat 2880 

gcttggtggc tgtttataca taagttatta ttttatagtc tactctttgg tccatttaaa 2940 

tcagttacct tcttattaca ttttaagact tatttattgt ggatatagca tagaaatccc 3000 

attatgcatt ataccttatt ttttattaac tatacccttt ttctttttaa tttttattta 3060 

aaaatttacc cctcactcat atttacataa aatttattat tcccccccgc ccccgcacaa 3120 

cactaattac ctttttcaac agcaaaagct tttatttggt ggaaaaaaaa ttgggcattt 3180 

ttttttttct tttaaagggt ttttgtgtac tatctataaa atccttgtgt aactcagtta 3240 

cacatacttt ccccatgttt tcctctataa attttataaa tttagttcat tcatttagag 3300 

ctatgtttct ttttctttag ttcctttaag gtatgtgtga ggttaagggc taaggttcat 3360 

gttttcactc tggtgcccac ctttgctaag caggccacct ttccatgctg catcacgcct 3420 

ctataaagtc aggaggaggg actcgctggc atctgacttg acttcatttc ctgtggattt 3480 

gtgtctgttc tcactgcaat agtatagcat cttgattttt gtggatttag agcgagtctt 3540 

gggattgaca gtacaactgt tcacttaggt caaaattatt taagctatgt cagtattatt 3600 

gcttataact tttagaatca gcattttggt tttttcacaa aatagtttgt taggattttt 3660 

ttttaataga cagtattagc cctataatag gcttagggac aattgatatc atagtattga 3720 

gctatccaat ccatgaacac agtatatctt tttatttact tagctctttt aaattttcac 3780 

tcaataatat attttagttt tgagtgggca gttcttgcat tcattttgtt ttaaaatatc 3840 

cctagatatt tcacatttct gttattgtaa attatatttt taactttaaa tttccagtgt 3900 

cttgttaata tagacaaata caactaaacc ttttatattg accctgtatc ctaaaatctt 3960 

gaaaaactta atacttctgt tatctttttt ggtagcttcc ctaggatttt ctacatatat 4020 

aatataccaa ctgcaaataa agattatatt gcttcttcc 4059 
<210> 23 

<211> 1496 

<212> DNA 

<213> Mus musculus 

<400> 23 

gattctgagc aaacacggac tgccacaacg gaggtcctag ccaccttctg atattgactg 60 

tgaccactgg atacagaaat ggctaacaat ttcactaccc cactggcaac gtctcatggc 120 

aataactgtg atctctatgc ccaccacagc acagccaggg tattaatgcc tctgcattac 180 

agcctggtct tcatcattgg gctggtggga aacctgctgg ccttggttgt cattgttcaa 240 

aacagaaaaa aaatcaactc aaccactctc tattcaatga acttggtcat ttctgacatc 300 

ctgtttacca cagctttacc cactcggata gcctactatg cgctgggctt tgattggagg 360 

ataggtgatg ccctgtgccg ggtaactgct ctggtgttct acatcaacac gtacgcaggt 420 

gtgaacttca tgacttgctt gagcatagac cgcttcttcg ctgtggtgca cctctgcgct 480 

acaacaagat taaaagaatc gaatacgcaa agggtgtctg cctgtccgtc tggattctgg 540 

tctttgctca aacactgccg ctgctcctca cccctatgtc taaggaggag ggagacaaga 600 

ccacttgcat ggagtatcca aactttgaag ggacagcgtc cctgccgtgg attctgctcg 660 

gagcctgtct gctgggctac gtgctgccta tcacagtcat tctcctgtgt tactctcaga 720 

tctgctgcaa actcttcagg actgccaagc agaacccact caccgagaaa tctggtgtga 780 

acaaaaaggc tcfecaacaca attatcctca tcattgtcgt gttcatcctg tgcttcacgc 840 

cctaccacgt ggccatcatt cagcacatga taaagatgct ctgctcccct ggagccctgg 900 

agtgtggggc gagacattcc ttccagatct ctctgcactt cacggtgtgc ctgatgaact 960 

tcaactgctg catggacccg ttcatatact tctttgcatg caaagggtat aagagaaagg 1020 

tcatgaagat gctcaaacgt caagtgagtg tgtcgatctc cagcgcagtg aggtcagccc 1080 

ctgaagagaa ttcgcgggaa atgacagagt ctcagatgat gatccactcc aaggcctcca 1140 

atggaaggta aaggcacttg ggacttcaca gcacagcaag ctgcgggatg ggccccgccc 1200 

accgactggt cggctcccaa caaagatgcc ttccactgcc gccccaccgg ccaatgcact 1260 

gagatccaga ccagatcgag gagacaaaaa agcaagttca acttcataaa tgaaatataa 1320 

tgtatataaa ggaaggctct cataagtctc aatgtaaaaa gaaattcttt gtgaaattac 1380 

tatttcttgt caatagtttg gcaaaagacg actaattgca ctgtatattg ccagtgtaaa 1440 
aatgttaata ctgtaatata tgaatatatt tcttaattta cacctctttc aatttc 1496 

<210> 24 

<211> 1341 

<212> DNA 

<213> Mus musculus 

<400> 24 

ggtggcagct tttctacaat gaaggctgga aagaccttgt gagaaatgag agacaagtga 60 

gattcctctg ctgcatgttt gcccacatgt ggttcctagc tcggtggcgg aggggcactg 120 

ggtcgtcatg tcaaaacggg ccagcttgct gccattaagg ttatggatgt cacaggggat 180 

gaagaggaag aaatcaaaca agaaattaac atgttgaaga aatattctca tcacaggaac 240 

attgctacat actacggtgc ttttatcaaa aagaaccctc ctggcatgga tgaccaactc 300 

tggttggtta tggagttctg tggtgctggc tctgtcactg acctgatcaa gaacacgaaa 360 

ggcaacacat tgaaagagga gtggattgca tacatctgca gggagatctt acggggcctg 420 

agtcacctgc accagcacaa agtgattcat cgagatatca aagggcagaa cgtcttgttg 480 

actgaaaatg cagaggttaa gctagtggat tttggagtga gtgcccagct tgaccgaact 540 

gtgggcagga ggaacacgtt catcgggact ccctactgga tggcaccaga agtcattgcc 600 

tgtgatgaga acccggatgc cacatatgat ttcaagagtg acttgtggtc tttgggaatc 660 

accgccatag agatggcaga aggtgccccc cccctctgtg acatgcatcc catgagagcc 720 

ctcttcctca tcccacggaa ccctgcacct cggctcaagt ctaagaagtg gtcaaaaaaa 780 

ttccagtcat ttatcgagag ctgcttggta aagaatcaca gccagcggcc agccacggag 840 

cagttgatga agcacccatt catacgagac caacctaatg agaggcaggt ccgcatccag 900 

ctgaaggacc acattgatcg aacaaagaag aagcgaggag aaaaagatga gactgagtat 960 

gaatacagcg gaagtgagga agaagaggaa gagaatgact ctggggaacc cagctccatt 1020 

ttgaacctac caggggagtc aacactgcga agggacttcc tgagactgca gctggccaac 1080 

aaggagcgct cagaggccct gcggcgccaa cagctggagc agcagcagcg ggagaatgaa 1140 

gaacacaagc ggcagctact ggctgagcgc cagaagcgca tcgaagagca gaaggagcaa 1200 

aggcggaggc tggaggagca acaaaggcga gaaaaagagc ttcggaaaca gcaggagcgg 1260 

gaacagcgcc ggcactacga agaacagatg cgtcgggagg aggagaggag gcgtgccgaa 1320 
catgagcagg aatataagcg c 1341 



<210> 25 

<211> 2368 

<212> DNA 

<213> Mus musculus 



<400> 25 

tcttactatt aagtatgggg atgattctgg tttttacaaa tgtgttagtc agattaaaga 60 
aattgccttc tgttcctagg atttagagtg tttttattgt gaagcagttt tgaattttac 120 
cagaggcttt tttaaaacct gattttgtgt atgcattcat gcccatgtct gtctgtctgt 180 
ctgtctgtct gtctgtctct ccctctccct ctctctctct ctctctctct ctctctctct 240 
ctcacttgta tatgtgtgtg catgtgtttt cttttttatt gatatggctt attgcattga 300 
ttgatttttc agttatttta tttatttgtt attttatgtg aatgaatatt ttgcatgcat 360 
gtatgtctgt gcaccacctg tgttactggt ccctgaggag gccaaaagaa gtagtcaggt 420 
ccctcacact agaattgcag aaggctgtga accgccatcc tgagcttagg gtttaaatga 480 
ggtcttctgg aaaagcagct gaaccacctc tccagccctt gatttttagg tttgttttta 540 
attattattt atttttattg gctattttat ttattttaat ttcaaatgtt atcccccttc 600 
caagtttccc ctctaaaaac ctccctccac acccgcccta cctctttgag ggtgctccca 660 
cacctaccca cccactccta cctcagtgcc ccagcatttc cccaccctag attcagctct 720 
taaactagct ttgcattcct ctaataaatt ccatttcaac atgccttaat tatcttcatg 780 
tgttgatgaa tttgttttgt taattttttg gtggtaattt ttctatgttg catataggtt 840 
aaattgttct atcgtttctt tttcttaatg actttttttt tggtttgact ctgatagcac 900 
agtgatatta gcttcataaa atgtgttgat gtatccccat ttctaatatt tggagcagct 960 
tctgagcatt tttcaatttt tctttaaaca tttttgaaat ataacgatca aagatatttg 1020 
gtccaggatt ttctttgttg atagttttat tattattatt attattatta ttattattat 1080 
agattcaggc tctttcctaa tgctttatta cctatttttt cctgatttag tttcagtgga 1140 
aactagctta tttcaatatt tttatagtat ttattaagcc ttctcatctt tcatctggtt 1200 
ctttctataa tgtgtcattt tctccatctt catttaaatc tttctatagc ttcaaacctc 1260 
aaatgcttct tattggtatt atgactgatt atttttatgg acttatctaa cttgctgaac 1320 
tatgtgtaga gtagcataag gatccagtta ctgtatttat ttcccgtacc ttttgtgtgt 1380 

atgtaggtgt ttgtctgaac gtgtggtctt tgtaccacct tgtgcagtgt ttacagtcac 1440 

ctatgtatgg ttaaggttct cggggactag agttacagat agttgtgagc cactgcatgg 1500 

ctgctagaaa ccaacttcgc tcctctggag gagcagccac tgattcaggc cctaaccagt 1560 

atttctgttt tgttgtttta ttgccactta aaagtttgcc ccacttgtga tcttacattt 1620 

ttcaatttaa aaaattacag cacagatagc tttcaattta taatcagttt gaaagtgttt 1680 

caaatattat ttcataaaac attttatgat actgggtatc acaatcttac ccaaattaaa 1740 
caaaatatgt ctttaggaaa agaagaaatt acaggccaat ttatcctaaa aaaaaaaaaa 1800 
aaacaagaac agaacacttt ttaataaaat actgacaaaa ccaaaatcat cactatatgt 1860 
aattaattaa tggcatgcag aagttttttt caaaggaatg gaaggttaga cctgaaagct 1920 

ggtaaaaaga atgcattgca ttctaaattt tactcatcga tattgaataa aattcttaca 1980 

ttttgttatg tttctatgaa gaaaaatctc ttaaaaatag aacgataaca ctaaagctat 2040 

ttaaatgcaa tgttagggtt ttgttaatgt agagaaaaca attttcattt tcaaatgttt 2100 
gggttatatg ataatgtgta gagttcttta gtgatgacat gatttttttt tttcttgagc 2160 
tacaaaaagt cccaaagtag aaattaagtt taaatttttc taaatcctta attaattaaa 2220 

aaaaaagttt catgctgagt gtggtggcca aaggcaggtg aatctctgtg cgtgtagggc 2280 

agcctggtat tcattgcaag taccaggcca tccagtgcta catagtgagc ccctgtcaac 2340 

caatgaagca gagacaaaga aacaaacc 2368 

<210> 26 

<211> 1941 

<212> DNA 

<213> Mus musculus 

<400> 26 

aagctagcct tgaactcaga agtctgcctg cctctgcctt tcaagtgctg gaaataaagg 60 
tgtgtgccac cactgcctga ccataaaatt actttttatg tagaattttt tttttttttt 120 
tttttttttc tgagacaggg tttctctgta tagccctggc tgtcctggaa ctcactctgt 180 
agaccaggat ggcctaaaac tcaaaaatcc gcttgcctct gcctcccagg tgctgggatt 240 
aaaagggtgc gccaccacca cctgactaga atttttatat atactttgta tatagaaata 300 
ctttttatat agatggtcat gttctgtata ggctatctag cagtcttggt tcacaacctt 360 
tcctgaattt ttagggcttt cttgttgtga tctaatgagc ttctgtgctt atttcagagt 420 
aacaatcttt gagggtagca acttgtaaat acagaatgtt tagcctagat tactgtaaaa 480 
attaaatgtt ggatgaattt tgaataggtt tgaacgaact atttacttat gaaggaaaca 540 
ttagttaatc tttaccactc ctgtttagct ttcactaata aagaaactcc tcaagtcctc 600 
aggtaatttt catgctcact gagtgctcag tacctgattt cactggtggt ctaagacttt 660 
taccctgagg agttatcata tcctcagtta atcaggaaac tgtgtcttaa gtatttgtaa 720 
tgttgatcat ctatttttca atttacccga tcattatcaa taaactgtta atgtacttga 780 
tgataaggtg tgattacttt atttcctata gtgtattctg tatctctgta cttcccagag 840 
tcagatcttc ccaattcatc ggttgtttat taagccctta actgttttta tagtgttagg 900 
ctatttgaat ttagatgatt taggtagttt gccataaagt atgggcatga tatccctgtt 960 
cttttgatcc tcctattgtt gatgtactct gtaagccttt gtctattgtc tatgtttgta 1020 
aataaggctg tattttaaat gtgtagtttt tttttctctc tccagaagga tctgcttttt 1080 
catttagctg aaagtgttta aaaatcatgt ctgtctgtaa agatgacaac agctcccagt 1140 
aacacagaag cctgtattgt gtgagctata acttggaaga atttcagata tacaatgtcg 1200 
tagtgatttt ctataacaat tttttattta aaagggagaa gaaactggct ttgtactctg 1260 
tgaatttcag ctttgtgttg tctatacatt gctcctagtg ccttggtaat gctgactatg 1320 
atgacatttt tgttacagtc ggcgaggctc tggcctgggc aagagaggag cagctgaggc 1380 
ccggcggcaa gagaaaatgg cagaccctga aagcaaccag gagacagtaa attcctcagc 1440 
tgcccggaca gatgaagctc cccaaggagc tgcaggtata ctgactggca cttaaaacac 1500 
acatatattt ttgttcgttt cacaaattta tctttggatg aattttcttg ttctacatcc 1560 
taagtaggat gaaaggaggg gagagaatta agaggttacc atgaaacact tttattttag 1620 
gtcatagatt gggtgctctt gatttgtggg ctcatttgtt gttaagactt aaactctcaa 1680 
gcagtccata catgtactac tttcagaggg atttatagta aggtataaat tttccattta 1740 
aggtttttat atattgctta gagttgacta atgattgttt cattgaattt aaattcataa 1800 
taaaaaagta aagatgtatt tgaattgctt tctaagcatg tagatcttag cattttatac 1860 
gccctaaaaa tttgttttgt tctgaagcta ctttagtaat atttagattt ttatgggctt 1920 
atttgatatc ttggattgcc g 1941 

<210> 27 

<211> 1940 

<212> DNA 

<213> Mus musculus 

<400> 27 

aagctagcct tgaactcaga agtctgcctg cctctgcctt tcaagtgctg gaaataaagg 60 

tgtgtgccac cactgcctga ccataaaatt actttttatg tagaattttt tttttttttt 120 

tttttttttc tgagacaggg tttctctgta tagccctggc tgtcctggaa ctcactctgt 180 

agaccaggat ggcctaaaac tcaaaaatcc gcttgcctct gcctcccagg tgctgggatt 240 

aaaagggtgc gccaccacca cctgactaga atttttatat atactttgta tatagaaata 300 

ctttttatat agatggtcat gttctgtata ggctatctag cagtcttggt tcacaacctt 360 

tcctgaattt ttagggcttt cttgttgtga tctaatgagc ttctgtgctt atttcagagt 420 

aacaatcttt gagggtagca acttgtaaat acagaatgtt tagcctagat tactgtaaaa 480 

attaaatgtt ggatgaattt tgaataggtt tgaacgaact atttacttat gaaggaaaca 540 

ttagttaatc tttaccactc ctgtttagct ttcactaata aagaaactcc tcaagtcctc 600 

aggtaatttt catgctcact gagtgctcag tacctgattt cactggtggt ctaagacttt 660 

taccctgagg agttatcata tcctcagtta atcaggaaac tgtgtcttaa gtatttgtaa 720 

tgttgatcat ctatttttca atttacccga tcattatcaa taaactgtta atgtacttga 780 

tgataaggtg tgattacttt atttcctata gtgtattctg tatctctgta cttcccagag 840 

tcagatcttc ccaattcatc ggttgtttat taagccctta actgttttta tagtgttagg 900 

ctatttgaat ttagatgatt taggtagttt gccataaagt atgggcatga tatccctgtt 960 

cttttgatcc tcctattgtt gatgtactct gtaagccttt gtctattgtc tatgtttgta 1020 

aataaggctg tattttaaat gtgtagtttt tttttctctc tccagaagga tctgcttttt 1080 

catttagctg aaagtgttta aaaatcatgt ctgtctgtaa agatgacaac agctcccagt 1140 

aacacagaag cctgtattgt gtgagctata acttggaaga atttcagata tacaatgtcg 1200 

tagtgatttt ctataacaat tttttattta aaagggagaa gaaactggct ttgtactctg 1260 

tgaatttcag ctttgtgttg tctatacatt gctcctagtg ccttggtaat gctgactatg 1320 

atgacatttt tgttacagtc ggcgaggctc tggcctgggc aagagaggag cagctgaggc 1380 

ccggcggcaa gagaaaatgg cagaccctga aagcaaccag gagacagtaa attcctcagc 1440 

tgcccggaca gatgaagctc cccaaggagc tgcaggtata ctgactggca cttaaaacac 1500 

acatatattt ttgttcgttt cacaaattta tctttggatg aattttcttg ttctacatcc 1560 

taagtaggat gaaaggaggg gagagaatta agaggttacc atgaaacact tttattttag 1620 

gtcatagatt gggtgctctt gatttgtggg ctcatttgtt gttaagactt aaactctcaa 1680 

gcagtccata catgtactac tttcagaggg atttatagta aggtataaat tttccattta 1740 

aggtttttat atattgctta gagttgacta atgattgttt cattgaattt aaattcataa 1800 

taaaaaagta aagatgtatt tgaattgctt tctaagcatg tagatcttag cattttatac 1860 

gccctaaaaa tttgttttgt tctgaagcta cttagtaata tttagatttt tatgggctta 1920 
tttgatatct tggattgccg 1940 

<210> 28 

<211> 2935 

<212> DNA 

<213> Mus musculus 

<400> 28 

tgtatctctt tcatcaattt ttctctgtgg tatagcaaat gaccacaact caggagcttg 60 

gaatagtatg tttattgttt gtgtgttttt atggatcaac tgggtctctt tattttttgt 120 

tgttgttttg ctttgtttct tgtttttttg agacaggatt tctctgtgta gccctgaact 180 

ccgtttgtag accaaacttt ctttgaactc agagaagtaa gcttgcttct tcctctcgag 240 

tgctgggatc aaagacatgt gctgctacca cccagcttca gctgagtagg tctcttattc 300 

aggcatcacc aggtggcctt acagtgttag ccaggctatg ttccccttct gagcttcaat 360 

cttcttccaa gccctctctg tctgctggca gattcctcat ggtctgaaag aatgaagctc 420 
cattttcttt ttcttttttt tttttaagat tttatttatt tattatatgt aagtacactg 480 
tagctgtttt cagacactcc agaagacgga gtcagatctc gttacggatg gttgtgagcc 540 
accatgtgct tgctgggatt tgaactccgg acctttggaa gagcggtcgg gtgctcttac 600 
ccactgagcc atctcaccag cccaaagctc cattttctag ccggttgtta acagaccatt 660 

cctagctact gtgttatgcc ctcctttggg agtgcatgtt agctgtttgt gtccttctat 720 

gccagcaaga gcagcctctg ctttggaagt gctcacctat gtagaggcag aaggatcgag 780 

gagttcaggg tcatcttcat ccacatagcc cgcttagaca acatgagacc ttgtgtcaaa 840 

agaaatgaac aataggagct ggcaagcttg ctaagtaaat gaggatgttt gccaccaagc 900 

ctggcagctt gagctcaatt cctaaaattc acatggtaag agagaaacaa tttccataac 960 

ttatcctcgt tctgcccaat aaataacaaa tgcttttttg tttgtttctt tgtttgtttg 1020 

gttttttaaa atctttttaa aagcgctcat gtggtaggat tgatcagcga gggtaagttc 1080 

cactctgtgg gttggtagca ggactctgga gaagggcagc ctcgcacctt tgttagtcct 1140 

acttacaaag aaccaaaatc taagccccag tgaggacctg cttttctgtt ttttcttctt 1200 

ttcttttctt ttcttttctt ttcttttctt ttcttttctt ttcctttctt tctttctttt 1260 

cttttctttt tttttttttt ttttgagaca gggtttctct gtgtatccct ggctatcctg 1320 

gaactcactc tgtagaccag gctggcctcg aggatctgct tttctgatcc ctgtgatata 1380 

tgtttagtag ttcatagggg tttgtttgcg aactgcagtg tttgctaaat gcatcttcat 1440 

attttcctcc tgacccggcc tctgtttcat aatgccgaga tcgactagct cttttatagt 1500 

cactaatcca gttaggaatc ttctgtctgt cttccacaag ggaggaagag atgcccaaga 1560 

aggaggtctg atctaagagc agattatttg ttgtcactag gttgaaagag ttgtgcagct 1620 

atgcaagggt tgttgtccta gggggcattc ttagaggcgt agtcattatt ttgttgagtg 1680 

tttcctctgt gcctctagag ctgaaaactg aacctagggc tttgtacttg ctaagcaagt 1740 

gttctatcat ggagctaaat ccccaaccct tggtggttta gtcattagca ccatcttctc 1800 

tgaagcctgt gatagctcac acctaacttc ttggatggtc tttttctcag aggctcaggt 1860 

atgaggagct agtgtaatca cttagccaaa tatctctgtc tccgtagctc agggcagctc 1920 

tggacagaga gccttcaggt aaggagtggg gactgtgtga gggagacttg gctcaggagg 1980 

agctgggttc tggctgaggc tctgcttaag ggtagtccct gaacatgcac ttgtttctgt 2040 

attcactaac aatataatat ttccatttat ccaaggttct gaagactcag aagatggagt 2100 

agaaatggca acggcagcaa tagagactca agggaagctt gaagctagca gtgtacccaa 2160 

ttctgatgat gatgcagaga gctgccccat ttgcctcaat gcatttagag accaggctgt 2220 

gggcacccca gagacctgtg cccattattt ctgcctggat tgcatcatcg aatggtccag 2280 

ggtgagttgg cttttcagtg ggctactgct accccttatg actaggctgc cttctctgca 2340 

gcaaaactga acacagggct gagtgctgcc tgggttctca gcattgtggg gaaggaatgc 2400 

ttactgtggg ggctgttact gggtgagaaa ggatgactct ggcttttctg agggtcagga 2460 
cacaaactct tcggcaacct cacatagtaa gtgtaaggca gcctctgtag gtattggaat 2520 

ggtgcaagtg tgcttactca gttctgatgg caagtcatga agagacaagc ttttatcacg 2580 

ctgcttacct tgcagtgaac tttaaggaga cttgcctggg ccttcaccca ctggcaaaga 2640 

cttttatatg tgtacaggta tgtggtgtgt gcctgtcttt acacgtagag gctaaaggat 2700 

gtcaggtatc ctgctctgtc gctttctgcg ttattccttt gagtcaggat cttttactga 2760 

agctgcagca taataggcag ctagtaaacc caagacattc tctcattctt catggcatta 2820 

gcattgagag atgaaggtat aatcttgccc aggtttttat gtggatgttg agatatgagt 2880 

ttgggtcctt gtgtttgcat aacaaatttt cttatctgtt gagccatccc catcc 2935 

<210> 29 

<211> 4090 

<212> DNA 

<213> Mus musculus 

<400> 29 

tacccaaata caagtaccag ggtggggagc tacaaggcct ccctgtgtga tctgggacat 60 
gaaggggcaa gggcaagggg attcctgaag ggaaggggtg gcattcaaat tctgaaagcg 120 
tggaagatct tattagcaga cagaggtctg ggccgggcga ggcagctggc aggggttctg 180 
ccatcattgc tggtccaggt aggcagacaa cagaagctct ggaggcagga aatccaggtg 240 
gcggttgctg actttcatct gcctccggcc ctccaggtca gcactggctg ttgtccagca 300 
tagcggtgta cacagcttgg ggggacaccg agaagaacgc cagcagccgt tcctcccaac 360 
tctccagggt ttccattggg atgcggctgc gcttgagcac ggaccgcagg cgacggctga 420 
tgcgcctgaa gcactcgtgg ggtgtgaaga cagggtcctc tctccgtcgg tcctccccac 480 
ggccaggccc acgtctccgc ttgccctggc tctcatcctc caagtaacgg agcactcgct 540 
cttgctcctc tcccgagcgg ttcataaaat cattccagac ctccacatag gtggcattgc 600 
tgcaggcctc tgcgaagatg ccaggtgcag caggagtcag gagggccccc ttccagtaag 660 
actggagaga gtgtgtcagt gccgtcgtcc atggagagtg aggaagccgc agaagtcaca 720 
gatgacccaa tggagcaaga ctaactactc agaaacactt agaggcagta ttttacatac 780 
agttctggtt ttacactgta taagactttt aagtaataaa gtggaccttt agttttacaa 840 
gagaaacagg ctgtaaaata aagaacctta agaataaacc ctgaaggttg tatgtggaag 900 
agctgtgagt acggctcctc tgggtcccgc ttagtgctga ttttcttgtt tggtttggtt 960 

ttttgtttgc ttttttcctg gtgtactcaa atcagtggtt tgtgtataga tttttttttt 1020 

aatttagaat taaagttttt aaactggaag ataattataa ttttgaaaac tttaaagatg 1080 

atctcatttg gtttatacat acacaaggaa gatttttgtc ttgtctctag cagtttccat 1140 

attttggtca tagtttaaac atgttaacat gtgaattaat agggtttcat gtggtttcag 1200 

atttttattg ttcgactgta caatggaacc ttaagtcata tatacatata tagattatcc 1260 

tgagggggga tgttatattt ttctgtttct ataagagatg aatacagtgg atactttttt 1320 

attggtaatg actgagttca cttctttcag aagacatctt ctcttctgag tagttgagac 1380 

aaaaatctgg cccctgtgag accctgggaa tctttcagtc tgttgaaata ccaggttaaa 1440 

cacattccaa gagatctgtg caaactaaat tcttttgtat acttctaagg tgcctgagac 1500 

aacaagactt catttattta tggaaagtgt gctttatctt gggagttgtg ctcaagcatt 1560 

agcttgctgt ctgtagaatg agtgcttcgg agcctgacat gaaaccatct caagagccca 1620 

caggtcccat aacattcggt tgcttttctg aacactgtca acatcacatc tgtctttctg 1680 

aataatccta tatgccgcat ggcactggtg tgaaacatca ctgtactgga agaattagta 1740 

gaggacttca gtaatgtacc tgagctaaaa tgaccgaggc gttaggggtg cacagaaacc 1800 

accatgattt gtataacatt ttgaagtgaa ttaatatttt tgaacatgct tcttcaacag 1860 

ccagtgttat atttttcaga tcaacacaaa gcacaatggt tactactcta taaactcaat 1920 

attttcaaat tcacatattt aaagtcatgc aagctgcaac ttccctgtca gaattactgg 1980 

ctgccaaatt tatacctgtt tcttcagctg tactttttga tatttagaat ttttaaattt 2040 

ctgtaaagta ggttttgtag actgtaatgt gttcactgcc tttgtgaagc ggtatattgt 2100 

ataatttcgt gtgtaactga atgcttgggc tttcaataca gtattcatat aaagcaataa 2160 

atattaatgt tatgaaatat tggagtacat ttttatcaaa atacaaaatc tcttttttag 2220 

tttcagacat ctgaggtaca gggatggacc taatagctga tacaaacagt ttctcacact 2280 

ttatctatcc taagcttcgg gtgatcagag ggaaaccaca aggtttgcat tttgactgct 2340 

tagacgttac tatggctaaa aagatatttg gctccgtttg ttcataaagt aatatgctac 2400 

tgactgatga ccttcaggtt cacagcagct ggacagtaga tttatgaatc tgtctagtaa 2460 

aactgtcgat ttactgtacc caaaggggtg aaagtcaaat gtaacttcaa gttttttggc 2520 

aaaaagatta aaatgaagca aaatttaagt gtgacattca tttgtaatgg cctgttagag 2580 
ttgagggatt gacagtaaga gcagatttta aacattttag gtttagagtt tttgagattt 2640 

tccttaggat atttatgagg ttgttgtcaa taaacctgtt ctctaagctc ctgtgacctt 2700 

tgatagtctt tatttatggt gccatggacc atttacaaac aagtcagttt gctgttggtt 2760 

aaaagttgaa gatttcgcat catgaataga ggtctgtggc tttatttgta aacttgcaat 2820 

tgctatcttt gcaaggggaa gtgtatttct ttattaaata aagtacaatt aataatggtg 2880 

aatgtaccaa aatgacatca ctcaattcta tgagaggtct gcattttaac ctatagttta 2940 

atagctttaa tatttattag ctattcctat gttgatcata gatgaaagtt gttgctgttt 3000 

atacagatac acgtaggata ctggtgaagg gtctctaggg cagttgtaat attcatcacc 3060 

gtgtgaggcc atgttcagac tgtggatgca ttggtccctt ggaagtccat ttccacacgt 3120 

ggtgttagat gcacttacag gagactgcag gaaagtgtca gttctaatga gcactgagtc 3180 

ctttgtgagt gacagaatga gacccaagag ggagggtcaa gccgtcccgt ggctgaagta 3240 

gctggctctt tgatgttgaa cagattccta accgggttct cctttcccaa gcctaactca 3300 

gatctgggca gtgctgatgg tgctgacaga atacacatga catggttttg ccacccctcc 3360 

cttttaaaaa gtgaaaacat tttgaaaact ctataaagtt ctgtacatgt agaacagaag 3420 

tgagtagtga aaatatattt tgaggattag taaacttaat ccacttaatt gtcacaactc 3480 

cggtctttcc catatgtagc cagagcaatg gagttacaac tctggctttc gaaagctatt 3540 

ccagaaaccc tgccccagaa agtcttaaag cattgagatc cttgtgtttt attttggcag 3600 

tgtagatagg catgtattta tgcatttgta aaatcaattt ttttcaaata atgtatgtaa 3660 

tgtactagct taaacggtac tgggcagagc ctagagctac tgcgaggatt gaatgtgaag 3720 

ccggtatcgc gggtggaaat gtacctgcag agctacagca aactattccg ggtagtgttt 3780 

caggctgcct ttgagcagga gtttcttaga tctattggtt ttgacaaact gaagatcagt 3840 

tcttgagatt tgtgttaatc atgagatgaa tggatgcaaa aaaccccttg taatttcatg 3900 

tggaattatg aaaattagct tgatgggata ttggcctaac aaggatgatg gtatgtactg 3960 

gctagaatac atatatttca catataaaaa ataatccgga caccagaatt ctcttctttc 4020 

aatttggagc ctagatcgat cactttgcca aataaatgta ttattttcat aatgcaataa 4080 

agtgtaaact 4090 

<210> 30 

<211> 3272 

<212> DNA 

<213> Mus musculus 

<400> 30 

tttttttttt ttttttttgt ttcaagacac agtggcccag gttggcctag aactcactat 60 

gtagcctatg ctggcctcaa attcacaaca ctcttcttgc ttgtctcctg aatgctggga 120 

ttgcagggct tgttcatcat gtctggttta tgtagtactg gaaatagaag gcagggttcc 180 

atgcacacta ggcaggtgct ctgctagagt agtggatcaa tcgcagtgac tgcatctggg 240 

gagtagtgga tcaatcgcag tgactgcatc tggggagtag tggatcaatc gcagtgactg 300 

catctggaga gtagtggatc aatcgcagtg actgcatctg gggtacactg aagctgtctg 360 

gggacatttc cagttaacac cactgagaat gatcaggctc cagcaggtga ggcaagggac 420 

gctgataagc atcttgatgt gcaaagatgt tgcttacaat gagctagcat ctggcccaaa 480 

tgtgggaaag tacaaaggct taaagtagac tgaagtctcc acgtgcaggg gatctgtgaa 540 

gctttgctgc ctgctgccag ttggtgctaa ttacttgtgt gtcactgtaa ggggaacctt 600 

gagggaaatt gtgtgggaag aaagctgttc tttggctctt catttcagag ggcctcaatc 660 

cagtcctgca ttgaagttag cctgagtgcc ccctgctgtg acctcatact ccaggcacct 720 

gggactgggg tggtcagatc agggataaga tggccagttc tgtctgattt tgactttcag 780 

aaaagtgaat gaagagttat ctagtgcagg tgtgtcccac accagtgaag gacatctgtg 840 

ctgttactcg gtaggatcta gcagtcgcat cccaaagcaa ttaagttact aaccagagta 900 

ctgggcagtc acctgaaaca cattcttatc cttagaaatc ttccaagcga accccccaga 960 

caactctggt tactgatagc tgtccataga ctgataggca gtctcccttg ctgaagacaa 1020 

cacctacata actcactgaa cacggagagg tcatgctggt gcctttggcc ctgcagacta 1080 

gtgtccgtgg ttctgggagg tactctgcat gctatcagag gagaaaggtg aacaccaacc 1140 

cagatacaaa tctttttatc tgcaatggtg accttcctgc atgagatgct ggggcaatgg 1200 

cggcactaag gttgtagaag taacccacac tctctggatt taaagaactg catgggatag 1260 

agcccatgcc tcacgctgac ttggtggcca ggaacctgag actacataaa ccatgaccta 1320 

gggggaaact attgttctgc tcaagggcta tagcaataca acaattccca atgtcactct 1380 

gctgtactca cagatcagca ccttgctcag ccatcatcag agaagcttcc ctctttagta 1440 

gatgggaata aatacagaga ctcaacaact ggacattgtg cagatagtga aagacattgg 1500 

aacactcagc cctaaatggt aacaggaact atgcagatga ggaggtggag ccagagggga 1560 

tggagggctc caagggaaca gtgccttcca ggcccaacag aactgatgca catatgagct 1620 

gacagactgt ggcagcgcac agggcctatg caggtccaaa ccagatggtt ctcagtactg 1680 

aggtggggga aggagacata agctctcatc cctaacccaa agctataact aacaactgcc 1740 

tatgaaggaa aaaattaatt tctccaagag tatcttactg gatatacaaa tacaattaag 1800 

ggcaggtccc atgctcagga aaacatggtg aacaccacaa gtaaactctg tgtgtgtgtg 1860 

tgtgtgtgtg tgtgtgtgtg tgtgtgtgag agagagagag agagagagag agagagagag 1920 

agagaagtca tggatagccc tggtgttgtt tgtattttga tgctaattcc gattccctaa 1980 

gaggggctgc ctaggaggag gagtgaatca cctactcagg tgacttcatg tgaccattct 2040 

aatgaataaa gatttcaagg aaaccattcc ctggacgggt aggcgggtct tccaggtcag 2100 

acgggcatgg cgggggtggg gtggggcaaa agaggagagg agagtgggct ggaggacagg 2160 

agcataaaga cgaaaatgta ggtagtgaat ccccgctccc ccaccccctg tttcaggtgg 2220 

cagatctgtt tgggtcagct accagaggat ttaacttaga atggctgata aattaggatg 2280 

ttaattgttg tgcccagtga ttgagttacc gttgattctg aactaagttt gtgtggtgtt 2340 

ttctttcact tggcggctca actgggttcc ggagagaaaa ggtacagtga tgtggaatcc 2400 

cagccagcca caggaatttg gaagtgtgga gctggcatgg cagcttaaca agcagggtgg 2460 

agagctccag gagcagagag tctgctgaga agaacaaggc ccgccagtgc cttgctggca 2520 

atagcatgga tagtttcttt ttacatttcc tgctgtgtgt atgtgtgtgt gtctatgtga 2580 

gtgtctctgt ctgtgtctat gcatgtgtct gtatgtcttg tgctcttttt tagtttcttt 2640 

tttgtttatg tgtcttcgtt ttgttttaac cttgaatgct tgtctttttt atatgcctat 2700 

ttttttttaa gagagaaaga aggtgtgggg ttgaaagggt ggagaggtgg aaaggatctg 2760 

ggaggagaca agggaaggga aaaccatggc cagaatatat cacatgaaag taactttatt 2820 

ttcaattaaa aaaagaaatt tcccattatg gttatgaggt agcaatgaac actacggttg 2880 

ggggtcagca catgaggaac tttgttaaaa gactgcagca ttagaagggt tgagaaccac 2940 

tgtcctatga acttctggct gtcctcatgt tcctccaccc tgagaaatcg caatactgct 3000 

gtatttactg tcgcctgcaa accctcccta agggttggtg agatggctca gtgagtgggt 3060 

tcttctatct tccacccttc ttcactcctt ccccaccctt tcaacctcca acactaaata 3120 

ggaaagaaaa agaatagaga ggaaaagggg gccaccttat tgctagacta cttcctgctg 3180 

attaaaggtg tccagttcct tgggtacctc tgatctttgt catcaggata tctattttcc 3240 

tgttgttgtt tcttttttgc tcaagactac tc 3272 

<210> 31 

<211> 3821 

<212> DNA 

<213> Mus musculus 

<400> 31 

gcttttggaa accagagact ccgagggagg cgaccaggct gcggaggaga gggccggctc 60 

acaaagtgct gctttgacac atccttagga tggaagttaa gtgaaaacag aaccacacaa 120 

aacaaaactc cgcgaagtgg tgctgctacg gaggaaacca aggggagaaa aacccggtgg 180 

gcaggtcaat ggttgcttcg cagcgctttg gcaagtttgt ggaacacttt ctaggaatta 240 

ggtctttttt gtacccccat catcttcttg acttccgaag aaagaagttg tgtttggatt 300 

gcaatggagt ctaaggagac agggctagac gcacgtgaat agtcccgcca gctgggctga 360 

atttgtggga atttagaaag acagcctgtg gaagtgcaac gtctctgaag tccccctggg 420 

ttcattcgga tggcacctaa cgcgtcccgt gacagacctc ttcaccaaca gcttccgatg 480 

ttgccatttt gctcttcttg accttaatta atctctagga aagtctaaac ttcggaccta 540 

cctctttttt tgatacttat tttttgtact tctgctctct gggattggtt tcttaaacaa 600 

cctggatcct ttttcatatg tcaaaatgaa tcctctgatg tttacactat tattgctctt 660 

tggatttctc tgcattcaga ttgatggatc tcgtcttcgt caagaagact ttccccccag 720 

gatcgtagaa cacccttctg atgtcatcgt ctccaaggga gagcccacca ctctgaactg 780 

taaagcagag ggccgaccca cccccaccat tgaatggtac aaggatggtg agagggtgga 840 

gacagacaag gatgatccca ggtcccacag aatgcttctg cccagcggat ctttattctt 900 

tttgcgaatt gttcatgggc gcagaagtaa accggacgaa gggagttacg tttgtgttgc 960 

aaggaactat cttggtgaag cagtgagtcg aaatgcatct ctggaagtgg cattattgcg 1020 

agatgacttc cggcaaaacc ccacagatgt ggtagtcgca gctggagagc ctgcaatctt 1080 

ggagtgccag ccaccacggg gacacccaga accaaccatc tactggaaaa aggacaaagt 1140 

ccgaattgat gacaaggaag agagaataag tatccgtggt gggaagctga tgatctctaa 1200 

tactaggaaa agcgatgctg gcatgtacac ctgtgtggga accaatatgg tgggagaaag 1260 

ggacagcgac cctgcagagc tcactgtctt tgaacgaccc acatttctca ggaggccaat 1320 

taaccaggtg gtgctagagg aagaagctgt agaattccgt tgtcaggtcc aaggagatcc 1380 

ccagccaacg gtgaggtgga aaaaagatga tgcagacttg ccgagaggaa ggtatgatat 1440 

caaagatgac tacacgctga gaattaaaaa ggccatgagt actgatgaag gtacctatgt 1500 

gtgtattgct gagaatcggg tgggaaaagt ggaagcctct gctaccctca ctgtccgagt 1560 

tcgccctgtt gctcctccac agtttgtggt taggccaaga gatcagatcg ttgctcaagg 1620 

ccgaacagtg acattcccct gtgaaactaa aggaaaccca cagccagctg ttttttggca 1680 

gaaagaaggc agccagaacc tacttttccc gaatcaacct cagcagccca acagccgatg 1740 

ttcagtgtcg cccacggggg acctcaccat caccaacatc cagcgttcag atgcgggtta 1800 

ctacatctgc caggccctaa ccgtggcagg aagcatttta gctaaagcac agttggaagt 1860 

tactgacgtt ttgacagata gacctccacc cataatcttg caaggaccaa taaaccaaac 1920 

acttgcagta gacggtacag cattgttgaa gtgtaaagcc actggtgagc ctctgcctgt 1980 

aattagctgg ctaaaggagg gctttacttt tctggggaga gatccaagag ccacgatcca 2040 

agaccaagga acactgcaga ttaagaattt acggatatct gatactggca cttatacttg 2100 

tgtggctaca agttccagtg gagagacttc ctggagtgca gtgctggatg taacagaatc 2160 

tggagcaaca atcagtaaaa attatgatat gaatgacctc ccgggaccac catccaaacc 2220 

tcaggtcact gatgtttcta agaacagtgt caccttatcc tggcagccag gtacacctgg 2280 

cgttcttcct gcaagcgcgt atatcattga ggctttcagc caatcggtga gcaatagctg 2340 

gcagacagtg gcaaaccatg ttaagacaac tctgtataca gtaagggggc tgaggccaac 2400 

acaatctact tgtttatggt cagagcgatc aacccacaag gtctcagtga tccaagtcct 2460 

atgtcggatc ctgtacgcac acaagatatc agccccccag cacaaggagt ggaccacaga 2520 

caggtgcaga aggaattagg tgatgtcgtt gttcgtctcc ataatccagt tgtcctgaca 2580 

cctacaactg ttcaagtcac atggacggtg gaccgacaac cccagtttat tcagggctac 2640 

agagtgatgt accgtcagac ttcgggacta caagcctcaa ctgtgtggca gaatctagac 2700 

gccaaagtcc cgactgagag gagtgctgtc cttgtgaatt tgaaaaaggg ggtgacttat 2760 

gaaattaaag tccggccgta ttttaacgag ttccaaggaa tggacagtga atcgaaaaca 2820 






gtccgaacca ctgaggaagc cccaagtgcc cctccccagt ctgtcactgt gctgacagtt 2880 

ggaagtcaca acagcacaag catcagtgtt tcctgggatc ctccaccagc cgaccaccag 2940 

aatggaatta ttcaggaata taagatctgg tgtctgggaa acgaaacgcg attccatatc 3000 

aataaaacgg tggatgcagc cattcgctct gtagtaatag gtggcttgtt ccctggaatt 3060 

cagtaccggg tagaagtggc agctagcaca agtgcagggg ttggagtaaa aagtgaacca 3120 

cagccgataa taattggggg acgtaacgaa gttgtcatta ctgaaaacaa taacagcatc 3180 

actgagcaaa tcacggatgt cgtgaagcaa ccggcattta tagctggcat tggtggtgcc 3240 

tgctgggtaa ttctgatggg ttttagcatc tggttgtact ggagaagaaa gaagagaaag 3300 

ggactcagca attatgctgt aacatttcaa agaggagatg gaggactaat gagcaatggg 3360 

agccgtccca ggtcttctaa atgctggcga tcccaattac ccatggcttg ctgattcttg 3420 

gccagccacg agtttgccag tgaacaatag caatagtggc ccaaatgaaa ttggaaattt 3480 

tgggcgtgga gatgtgctgc ctccggtgcc aggccaaggg gataaaacag cgaccatgct 3540 

ctcggatgga gccatttata gcagcattga cttcactacc aaaaccactt acaacagttc 3600 

cagccaaata acacaggcca ccccatatgc cactacacaa atcctgcatt caaacagcat 3660 

ccacgaactg gcagttgatc ttcctgatcc acagtggaaa agctcagttc aacagaagac 3720 

agacctcatg ggatttggtt attcgctacc tgatcagaac aaggggaaca acggtgggaa 3780 

aggtggaaaa aagaagaaaa ctaaaaattc ttcgaaagcg c 3821 

<210> 32 

<211> 1490 

<212> DNA 

<213> Mus musculus 

<400> 32 

tgaagaaaat gaagacggga gaaaaacgaa gctggccatc tcatatagag cagtggactt 60 

tgagtaatca gtcagttaag ataatagtta gagagttcta gaaactggtt caaaatggtt 120 

cgactatgag taggatgagt ggatactaaa tgtcccttgc tcccatccca ccatcccaat 180 

cctacctaga gcctgctgtg gagttagaac ccagaactcc attcaggtga cagctaagtc 240 

tactacgatt ggaaccttct tggttccaat gatagttctg gaaagcaaac aatgaaaaga 300 

gaatcgtgcc cagtgtttgc tgggtgtgag ggtcttcggc agtggggacc agatggtgag 360 

gaacggaagc aggcttctgc atcgccaggg ttcaggttcc ttctcctggt ttgagttctg 420 

tttttttttt ttttttaaat gtgaagagat tttctttgtc atttctaaaa ctcctgtcag 480 

ttgctggatg tcctaaagct gttgaatatg gacgtaactg taaatcccag agtgttttat 540 

tttgagatga gagttttgct acagtttata caggatattt tcctatttag acctcagggc 600 

tatcttggga cccactacta agtgtgaccc ctccccgcag cttcaattct gggtagatga 660 

gttacataac ttttaaatgt gatatccagc aggaagaatg tatttctctc ttttaacttc 720 

gggcaaattt tgttttaaag gtctagaaaa aagacagtag aagaaaacag aggtaaaaga 780 

attaaaaacc aactgtaaaa taaacattct catggtatac aattgtgcta cctgaataaa 840 

cttatgtgca taaattattt aaaagtgcta tgaaacatat ggtattttcc tgtcatttgt 900 

ttgggttggt gtggttttat tcatttgatt tccacatatt tgcactttat ttttcaaagc 960 

taagggccct tctagtcgtg atccacccct tccaggaggg gagcacccct ggacaaaatg 1020 

tacctctgtt tccctgtgta tctctttatg agtggcacgc catatgcgct ttcatccagg 1080 

gcttttcttc ccccatcaat taaaatgtcc ttgagattta aaaataattg gaaatatatt 1140 

tttatattta ttgtgcgtgt gtgtgtgtgt gtgtgtgtgt gtgtgagaga gagagagaga 1200 

gagagagaga gagagagaga gagagaacat gctggtgtgt tagtgtagtg ggttaaagga 1260 

cggctgtaga agctggcccc ctccttccac cacatataca ctagggatca aactcaagtc 1320 

gtcaggctta gcagcaagcg ccttgaccac agagtcatgt caccagtcta aagatgtagt 1380 

tcaggttgac ctcaaagttg tgatcctcct gcctctgcct cttgagcata tccttcttgc 1440 

atgcaccacc acaattgact taaaatattt taaaaatcag ttttaatggc 1490 

<210> 33 

<211> 2185 

<212> DNA 

<213> Mus musculus 

<400> 33 

ggctgctgct gctgctgctg ctgctgctgg agcaaatgaa gaactctttt tcttaagcag 60 

ataaccagct tctggcagtt gcatgatctt gctattgaag tggaccttgg taaaaagtgc 120 

tggtatcact ccatatttgc ctgtcccatt cttcgtcagc aaacaacaga taacaatcca 180 

cccatgaaat tggtttgtgg tcatattata tcaagagacg ccctgaataa aatgtttaat 240 

ggtagcaaat taaaatgtcc atattgccca atggaacaga gtccaggaga tgccaaacag 300 

atatttttct gaagaatgag tttgtttgca atttgtaagt gaaactgaat tatgggtaca 360 

ttcaagacaa gagtgttcca ttgactgcag ctatccaagg accgcctgtt cataagctat 420 

gctccagagg ctgccgattc actcgtgtgc acggaggggg tgctccagat gggaatcaca 480 

cagggctttc ttcactcttg gtcttcgttt ctgatcaagt aaacaccagc agttgtcatt 540 

cagtgcaggt ttttgtactt ctatatggtg atttttttac ttaaaagcag aaacagaagt 600 

tgaccttcct gacatgtgtt taatattcct cctgctttta cagattctga cgttttcttg 660 

ataattgtaa gcttgagagt gtttgtggaa gaacttactt tcttcttatg tatacataat 720 

taaatgaaaa gtcttcatag gagtttgaca aaatgaattg tggttataaa acgaatttgc 780 

ttttttgttg ttttgttttt cttttttggc ctaaggaaga aagctgtgat aaatttcaaa 840 

tttgcatagc ttttcaatgt tttgctctgc tcccgctctt gcttcagagt cggcacactc 900 

acctgcattt gagttctgtc tatagcccag caggctgcct gtttaaaatc ccatcatgaa 960 

tttaacaggc tgatgtgaga atgaaataga tattactggg gttttgttgt tgtttgttta 1020 

tttgttttgc ttttattacc aagaggtgct ttttaataaa tggatattga agttagggtg 1080 

ttactaattt gatgtatggt ttcacagtct agcatactgt cctttgacat ctgcctttaa 1140 

gacttggctg agtgtctcat tagtttatca tcacagatac gcagtgttat gcatgtgtat 1200 

agaagtgtgt gcaccagcat caaacattgt gtgtgtggaa gggaagaagc ctgtccattc 1260 

taaacgcagt tgccagtctc atcacttcag gtccttacgg gcaggctcta gcaactttcc 1320 

gtgtatggac ctcgtttttt gctgttttgt gtttaattag tatattgttc atgcctctct 1380 

tctgcagtgt ctcatctcat agactgtgaa cctgtatatt attcaaatgg ctacagataa 1440 

tgctcttttc ttttgtgagg tctcttcatt taatgcactg cccagaaaga gccatgtgta 1500 

agagttgttc tctgtttgag gaactaacta catggaaaag acttctgact taaacccatg 1560 

aaatacttca tcttgagaag agtgctatgt ggaaatcacc aaatatctcg caactttatt 1620 

tcatctggtg taaatctgaa catcaacata ggaaaactgt catgagaaaa tgaaaaagca 1680 

taaacacaga agcaacgaga aatgtgactc ttgttatttt aaaccacaga cggacttggg 1740 

ttaagggaat ggggacgaca gctttggtgc taagttaatc agaaattgcg agcatgcaca 1800 

gtggtatgcc agcctgggtg atgctttcct aggagagccg gtatttgctt gtaagggaaa 1860 

gaatggtatt gtagaaaaac ccaagaaatg accacgtggt cagtttcatg gtgatggcta 1920 

ctgtgtactg tgtaggttac ctgggtcaag actggtccag cttagagtgc atgctctgtt 1980 

catactcagc ctagctcaca gtgacttgtg cttcactgtc aggatgctct gtaaagttcc 2040 

tcatccctgt gaagcaggga aagaacagag gtgcctctgg actgcaggag aaaggcgtcg 2100 

tgccaggcag tatcctgtgg agggaggagg gtgtacgttc gtttactatg gatagttttt 2160 

cctttgaatt aaattgtagc gtgtc 2185 

<210> 34 

<211> 3598 

<212> DNA 

<213> Mus musculus 

<400> 34 

aaagagagga aagatttaaa atagccatat taggttatct ataatggggt caccatccac 60 



ctcagaaaat gggcagtgct cctactgagg gctatcaatg aaagacctgt gctgtgctgt 120 

gctttgcact gtctatttga aagtctcaga aggtatggtt taccttatcc cagccaccgt 180 

aattaagtga atgcttttag actgtgaaag gatagttgcc atttggccta agagcactta 240 

ggcagagcgc ttcattcccg tgtgatgctg cataccgttg tttttattta caatcccaaa 300 

cgctctgtgc cttggttttt cactcgccaa aaccaacctt actctactaa atgaaatgca 360 

atcttaccag tttaaccagt atgctgtgat attgtagtga gtctcagaga tgactggaaa 420 

caggaggcct gtcatttcag tgagcaccat cacctaagcg gtaatcattt ctcctgtgtc 480 

ttaaactgct ctgactcctg agcaagtgtt catgtctgtg tgtcaaaata aaaagttttg 540 

tgtgaaggac tgtctttgct ttcctccatg gtttttactg tacatttccc tgtcgtcatg 600 

aagtggggtg cagagactca ccttttatta aagtagctgt gtgaagtaag ctttcggtgt 660 

atccctaagt ctgttagcat gtactccttt gtaatatctt gagtacggtg atctttatgt 720 

acgttttact taccaactca caaatactgt agcaaatgaa tgtgaacatt tactttctga 780 

aaagccagac aattttgttt tcaattatag tactgagcaa ttaagcattt agataatctt 840 

ttaataacca aagctggtcc cattctggtt ttgccttttg cttacttgtt gcttcaatgt 900 

ttttagagca gatgtttttg gttttttggt ttttttgttt tttctaatat atgtggcttc 960 

attgttttaa gttggtgtgg gtatctaact tacaaagctt ttacattttc tttaactggg 1020 

cctcatgtgg tcccgagtag ctcttgtaaa cttagtcgga cagtaagtca ataaactctg 1080 

cttgtttcct cctgagcctc tgtgtgcccg gtagccacat ctcgtacatc tgcatcaggt 1140 

tcagtggttg gtctttcccc caggttaccg tagtccagcg tcatgagttg aatgtgatcc 1200 

gtggtgatat cttctacaga cattcaggtt ttctttcctt gttacctgcc tctgttcact 1260 

tgtttaactc ttcttgtcta ttcaccccca ggctctgccc caccccaact ctattctggg 1320 

gactaatccc ctggcccgct ttaggcttgc tgggaaggtg cctagcactg agctaacccc 1380 

tcagctcctg ggtttggctc ccttgtttca gtgtttgcta aacccttttc cacccttctc 1440 

accagtccct accttttgtt ttcaaagcct tgctccatac ttgtagctgt ttgtttttct 1500 

tcagtgtttt tccaagtccc aaaattgtta gtacttttaa ttaaattgtg tggatgctat 1560 

aaataaattt gataagtacc aacatttact taaagtcaca caatagtgaa acattgggac 1620 

aaaattcagc ccctcatctt caaatcagaa atcctctttg gtagatcctt atacgttcac 1680 

aggtccagct gtggctggcg tcagctgccc ctgattgtgg gaagcattag atcctgtcct 1740 

gaaggagtct gggtacccgc ttagtgtcct ctggtcaaaa tgtttctagc ctgtattcct 1800 

gggtaaacat tcagaatgac attgccaagc acaggctaag ttacaccact ggagtacctt 1860 

tgcaaataga aatgtcctat caaagatgac accggtgatc aaagagttgg gactaggaat 1920 

tttagccagg aattaaatct cacgctcggt gtgctaatta aataactggg gggcgggggg 1980 

cttgaggggg tgagcaagtc ctgtctgtgg aagctgacta gtaaatatgg cacttaattc 2040 

tgccaatgtt caggtcaagc aatttaaggc agttagcata ttttgaaagt agggaactgt 2100 

tgttttgttt tgagacacgg tctcacttta tagcccaggg ttggcctgga actcattttg 2160 

tagtctagtc ggtcttcaaa ctcaaggcag tcctcctgcc ttaaccttcc aagtgctggg 2220 

attatagacc taaaccgtag tgtccagata ctgctcagtt ttaatagata cactataggg 2280 

aggaatgctc caaaaaagat tcatcttgta ataacgtgag catagttcag gtcagccggc 2340 

ctggttgatc tcagctcctt cactcagggg cgagggctag ctcagttcct ctgccctggc 2400 

tgtgatagta ctgggtagag cagcagtctt caacctgtgg gtcacatgtc agatgtctgc 2460 

attatgattc gtaacagtag caacattgca gtcttgaagt agcaacaaaa taactcgtgg 2520 

tgtctgcatg aggaactgtg ttaagggtcc cggcgttagg aaggtatctg ttgaggattg 2580 

tgctttcctc tcgccctgat gttgtctttg cttccctggc tcccctttct cgcctttcct 2640 

cctcttatgt ctggcagcat tttctcactg aaggaactgt caacatgaac ctctctctct 2700 

cactcactct cattctctct ctctctctct ctctctctct ctctctctct ctctctgtct 2760 

ctctctctct gtctctgtct tgcacacaca cacacaaaat gacaaactct tcccccccca 2820 

tccaaaaaga aatctacctg tatttcaaca atagattaat gctgaaattt tgactcatac 2880 

aaactaaggg ttttttctta aactcgtaga ttattaattt gaataacgta cagagaattt 2940 

taagtttgct taagatctct ttggataaga acacctatta aaaaatattt gagggggctg 3000 

gtgagatggc tcagagggtt agagcacccg actgttcttc caaaggtcca gagttcaaat 3060 

cccagcaacc acatggtggc tcacaaccat ccgtaacgag atctgactcc ctcttctgga 3120 

gtgtttgaag acagctacaa tgtacttaca tataataaat aaataaatct ttaaaaaaaa 3180 

atatttgagg actagagttt tgtcacaaag ataaaactcc aaccgccttc acattacttc 3240 

cttctggacg tggaagctgg gtaaacagga gagttacttc cttcttgtaa aatgtttgta 3300 

ctggagatgt tgaaaggcca gctctgtgtt ctcaggactg taaattatct aagcattttg 3360 

atggattggg ctggcttaat ttcctccctc tagttaaaaa gaaatgcagt tgtttacatc 3420 

ttgcttgtag ctaatcttaa aagagagccc tgtttcactc aggtcttcag ggcacgtgtg 3480 

ctacagaatt ttttggaaat gtgtgacttg cgcaaagctt ggtggtagag cacttgccta 3540 

gaatgtgtga agaagtcctg tgtgtgtgtt taaaatgtac tttttaataa aacttttt 3598 

<210> 35 

<211> 4153 

<212> DNA 

<213> Mus musculus 

<400> 35 

gatcagaaat tcaaagccag cctgagctag atagtaaaag gtttgttttt ttttttttaa 60 

gttaaaaata ttttaaatta tttctgttaa ataaataaaa ttttaaaacg taaaaattca 120 

cagcccaaaa ttgtatatat ggaatggggt tgattacata cctctaattt tgcatgtaca 180 

gaacaagacc gatggggaaa aaaaaataca tctagaatta aacttcaccc agaaggatgg 240 

tgggccaatg taatagtctc ttcctctccc agtgtagttc cagtccaggg ctaaacagta 300 

tggtgtgtgc ctctgctgct cttacaggaa agcgagcagg cagaaaagat caacatcagc 360 

cttgccttct tcctgtatga cctcctgtca atcatggaca gaggcttcgt gttcaacctc 420 

atcaagcatt actgcagcca gctgtcagcc aagctgaata tccttccaac gctcatctcc 480 

atgcggctgg aattcctgag gatcctctgc agccatgagc actacctcaa cttgaacctc 540 

ctcttcatga ataccgacac cgcaccagca tctccctgcc cctccatatc ctcccaggta 600 

gtctgctaac tacaggaaag gggcgagctg ctttgttaat tagcttagtt caatggggca 660 

tctcctatct tactaattag aggaaaatca cactattcaa accagatcag ttgatccaga 720 

tatgggctct acggttccct gaagctgcag catctaaatt gaccattttc aaatcaagta 780 

atttggcaca agggtcccta aggaggttaa ctatgtagga agaaatctgt aagcctagga 840 

aaatgaagaa acacaggtta agtctcaagt cctccactgt cgacataaac caattattgt 900 

acaatatagt gtgcaaatac aggtttagta tatttgcaaa tacaggtttg caatgtctag 960 

caagattcag aagcactgtg ttagcctagt ctctttgtcc gacagcttac aaaggaagaa 1020 

ctgcagaggg ggaggggcac gagatgaaat gattcccaat aggatttgac tgtggcaaat 1080 

gtccttactg cgtcggcacc agaggtttct caaggagtta gtgtcctcta gaaaaccctt 1140 

aaagaaatgt tattttatag ccctaatcca gatgtgggaa acaaggcaga tgttgacagc 1200 

agtaggcaag atagctcagg atggtattag attctattct gctgcatatc cttggtacta 1260 

gtacagaggg aaaggcttga cagtgataag cgccttaagt tccaggacta agaggacggg 1320 

ggtggggatc acatgggccc aggagttagt tccagatcag cctgagcatt atagtgaagc 1380 

ctcatttaaa aataataata ataataataa gtcatagtag tagctatttg tgcctgacaa 1440 

tatacctgaa gttagctcca tgaccattat aatgtgatct gggacacacg tgacttacaa 1500 

ggaccaattc caaatggact tattattgtc tgctgtcgtc tgttctgtgg gctgaatcct 1560 

ataacttccc catgctgacc ttgccctggg tcctctcccc atgctcctct tctggctgcc 1620 

ctaaagagtt gacagaccca gggcccctac tccacagagc tccaggactt gcagggacac 1680 

ttacttctgt ccccggtgaa ggatgtcacg ctcccggagt aagaacacgg gacaggaatg 1740 

gcctagggct gccaactctg gcattgtcgt agcagacacc acttctggtt ccaggccagc 1800 

taacatccag acactgtgct cagatgacct tcagaaagga ttctggggag acagaccaca 1860 

gcgagggact gggtacagtc cttatttctc tgtccacgtg taccctgggc tgttcatctg 1920 

caacttttaa taccagacag gcaagggagg gattagctag ttttctttgt tctgtttttc 1980 

ccattttgct gttctgggga cctggtcata tcaggcacat ttcccaccac taagcttgac 2040 

ccctagccta gttttcatag tcatgttcat gttggccagt cagtgcatgc gccgtccggc 2100 

cagagctccg taaatgcaag ccagcactgt tttaattcca gctcttcata ctcataaagc 2160 






tgccgttggt atttctgtag aactcgagtt cctgctccag tttccaggac caaaagattg 2220 

ccagcatgtt cgatctgacc ccggagtacc ggcagcagca cttccttaca gggctgctct 2280 

tcacggagct ggctgttgcc ctggatgctg agggggatgg gtgagtatct gacgcctaaa 2340 

atggaacctg aagggaaaga tgttaacagc attcccagtt caacttctta tgtgtacagc 2400 

aagacctcag aactgtacca cttaccagtt ccaagaagaa gcactcctgt cttaagaaag 2460 

tggtactggt ggagtataga gaggcaaggt gattagaaga tccatgaaat gggcttcttt 2520 

ttacagcact ggcagttgaa ctgagagcct ctcacactgt aggccagtac tgtatcaatg 2580 

agccacattc ccagccccaa atgtctattg tttgtttgtt ttttattttc gtgacagttt 2640 

ctctgtgtag tcctggctat cctagaactc actctgtaga ccaggcttgc ctcaaactca 2700 

gaaatctgcc tgcctctgcc ttccaagtgc tagaattaat ggtgtgcacc accaccatac 2760 

ctgcctcaga aagaggtttt gttttgtttt gttttgtttt gttttgtttt gttttgtttt 2820 

gttttgtttt gtttttcgag acagggtttc tctgtatagc cctggctgtc ctggaactca 2880 

ctttgtagac caggctggcc tcgaacttag aaatctgcct gcctctgcct cccgagtgct 2940 

gggattaaag gcgtgtgcca ccacgcccgg ctcagaaaga ctttttaaag tcccttgttt 3000 

gagcagaatt tgcagagaat gctttgctga gtgaattccc taagggcaaa cccattttgc 3060 

aggggaggct gctgaggtgg aacccagggc tttgcacaca ctaagcaggc gctcctccac 3120 

taagctatcc cgtcagccct aacaaggtca cctctgacaa agtgagcaca gagacgaaat 3180 

aaataccagc atgccactgt ccgaggctgg caaaggaagt gacggtgaag aagccagctg 3240 

tcttgggtct agatgacgct tcttatgggc aagccttctc aggcaaaggc agatgccata 3300 

ggcatgggtc tcgtgaatct tctcactgtg tccatccctg agcacctgag tgtatggcag 3360 

tgcccggact tacttgttgg ctcttggctt gtattctgtc atcttttctc tgctctaacg 3420 

acattccagt cttctcagaa ttttctgtcc tataactaac ttattttctc agcatcaaaa 3480 

agtgccctag gatatatgct gataaggtcc tgaaagaaga aaatttgcct tgcaatataa 3540 

acaacccccc attttcactt tcttaattgt ttttaataaa tagcatggct ttaagtaatt 3600 

tctgaaacta tcttttatat acacagtagc agtagttttt aattttctat ttttgtccca 3660 

gttggagaca tccctgccgt ctggttttga ttcttctaaa atcattgctg ggatattgct 3720 

atatagcttt agctgttccg ggcttcacta tgcaaactag actggcctca aacttacaga 3780 

gatctgcctg cctctgcctc cccccaccac cacccccagt actggagtta gaggcatgtg 3840 

ccaccatgct gagctcctaa tatttttgaa gagcaaaagg ttgaaataga cattttccaa 3900 

agaatgcgtg gtcaaaagca cacaaaaaag gtgtccagaa tctctaacag tcaggcagtg 3960 

cttcattaaa acaatgatta tataccacat tctgcctact agaatagttt gatcaacaag 4020 

acagacaggc cagggatgtg gctcagtggt agatctctag catgtgctaa ggtctgggtt 4080 

ccatccccag cactttaaaa aaaaaaaaaa agaaagaaaa gaaaagaaaa gaaagaaaga 4140 

aaaaagatgg ttg 4153 


<210> 36 

<211> 3009 

<212> DNA 

<213> Mus musculus 

<400> 36 

atagcaacaa caggagcctg tagcaagagc aggagccaca tgggccctgt tgcacagagc 60 

tcagagaagg cgatgatgcc catgccaggg taggaagcaa Sgagttggaa gtttaatggc 120 

gcatcttgga gatgcggcgt gactcctggc cgaggtctgg ttcttctcct agcccaagtg 180 

gggacgctct ctctgccatc cttctgcctt ggctataccg aggatgcgca agagtctggt 240 

gtcctcctag aggactctcc tgttactgtc tttccccfctt 300 

atgataataa agtaaaatca cgtcagtcca aacgtcacag 360 

tcccaagtgt ggcctgcagg tcctctgccg gatcggtcgg 420 

tggggaggtt cattctgagc cgctagtccc gccgagactt tatgctagtc aggaaactgt 480 

gttctctgga aaggattatc ctgctggctg tcctacctct ctgttaagtg gcggcccgat 540 

gccgtggact ggtgtgattt tattgcacta ttaatcatta tcatgggtga agctgccgta 600 

gatggtgtaa agtggctggg tcccccttcg ttctgtctgc acagggctac aataaataag 660 

ccagtgcgtg gacaggaagg gagtctttat cccagaatgg ggttcagcct ggagcaccat 720 

cacccagctt cttagcggag ccctgggtgt ggccggatct cctctgggat ctctctattc 780 

tccactgcac cctgggaatt aaggacagct atgcctctgc ttgagcaagc ttcccagccc 840 

caccttttct ctctgcttgc ctccctgcct cgctccctgc cgggggagtg gcctcgcttg 900 

ctctgggcac cccagtctct tcccctagca tccttggtct cctattcact cccctgttca 960 

tttttgtttc cagtgaagtg tgccaccccc agccctccca cctcccctgc ttccccgatt 1020 

ccaccagtca accccctctc gtctgaccgc ccacggggcg ccgacagcag ccataccatg 1080 

cgggtctgag ctctgactgc aagccctggc tgaggccaat gctgtgaagc tccacagagc 1140 

caccttctga tagcatccat tgcacacctg gggctctggc ttctcaccct ggcctcctgc 1200 

ccttccaccc accatgggaa cacgtcagag agccaggctg gggaccgggg ctgcttcata 1260 

aaggaacatg gatgccttca agttcacatc tgctgccctt tccctgaagc ctggcactgt 1320 

cattttatgg ttttaaggca agacccgggc atggcaaggc caggatggcg tcctctctga 1380 

tgcccctgtc acggggagct caagtgcagt tctggatgga ttgtgtggcc ctcctgcacc 1440 

atcccccgtg gagtccatgt gctggtggga agctcatgct atgggtgagg gctagaagtg 1500 

aagacaagac agactccatc ccttggaacc cgtacaacac agcgagaggc caggtcttgc 1560 

catcaccttc ctcccattca gtcccagctg cctcagcgat gcccaaggct ttggcacggc 1620 

tctgctgatg ggtttcccag agttcactgg aggccagcta ccctgcttga gccaaagaag 1680 

acgatgagtt ctagggagag gctcctgggc tcccagaggg gtcaagtgtg tgacagagag 1740 

acgacagcag gtctgcacag tgtctgaggg caagttggaa gcaaggagca agatggaaga 1800 

gaaaagaggc ttagagagtg aaggaagaga aggcagacgc ttttcacaag caacagggat 1860 

gtaaagaagg agggaaatgg gaagggagaa tagaaatggc ttccctagtg tggagcctta 1920 

ggtcagtgcc aagcagaggg gctgtcacct ctgtaccttc acgtcttcct cgggagcagg 1980 

aggcgccagg aggactcatg ccaggcacat gccagctcca actgaggtgc ttggtagcaa 2040 

ggtatgaggt aaggggttgt tagagtgcta tagcctgtga gatggtccta tctgtgtcaa 2100 

ggcctgctgt ctctctccca gggtcatagg cagagagaag acggtctcat atgaagtctg 2160 

tcagccttgg ggccttacct agccagttta aacccggaaa gtactgtggg ctgactgagg 2220 

tttgccctcg gaggaggaat gaggaattaa ctgtgaggcc aagttctagg tccttccttc 2280 

tcatctcagg catttagagc agggccagat gctttcctcc accccacctg cccagggagg 2340 

acaggacagg gagagaccct agcagagcag aatcttcctt tagcccacct accgtgcgtg 2400 

aatgtagcca gacagcagca aaggaaggct agcttcagac accaagccac cagacctggc 2460 

tctccacaca tttttgccca gagacttcag cctgaacatc agtggcccag gaaacaactg 2520 

catcagctcc catcaatcca tcaccactcc gtcatgggtc gggacagtta ctggttcata 2580 

tgcaagtaaa gatgacaatt ctttcaacaa aaattagtga agcactctct gtgtatcagg 2640 

cactgttcta ggtgcttagg atattgtctt gcttctgagg gactctggat cgggaggatt 2700 

ccacctgttt ttgcctctct ctggcttctg gaaagtgctc cttcatgaca tcccctatct 2760 

gcatagttca agcaccccaa atctgcccct tcctccttaa agactggtag atgggatcag 2820 

ctcccaggtt accctctctc cccataccct cttccaaaag aaacaaaatg ttagggcttt 2880 

ccccgtcgtt gtctctcctt tttttaaaaa gtggtatttt ttagaaatac atgtggaata 2940 

ccaagaaatg tctttgcctc cccaccactc tcacctacat ttcataaagc tggctcttta 3000 

tgttgcttt 3009 

<210> 37 

<211> 1599 

<212> DNA 

<213> Mus musculus 

<400> 37 

gagaagcatt tgcatgcaga gttggggatg gggcacagag gggagagacg agaacagtca 60 

gtcggtgggg tgcagtctgg aagagtgctg tctaggaaca cagaagtaat tagcaggaga 120 

aacagctgca ggatttaaga ttggattttc cgagaggatg aaattggttt tgaagtaaag 180 

gtggatccag cttttgtgtt tgacctttac cctgtggaat aacttgattt ttctaagctg 240 

caagctgttc gaaacctctt ttagcccatc agtggtttgt ttcattttgt tttgagatgg 300 

gctttcactg tggcgaccca tgctcttggc ctccgtggaa cagctccggc cttagcctct 360 

caagtgctcg gattacaaca tgtaccacac caggcccatc tgccagactt ggagtaaatc 420 

accaagtctt aggagccctg acacagatgc catctgccac aggcatcttc ccttctgcct 480 

ttgtccttcc cggctgagct ccagattgta gaagacatct aaggttccag tatgactcca 540 

tccatggcaa attcaatggc acagtcaagg ccgagaatgg gaagcttgtc atcaacagga 600 

agcccatcac catcttccag gagcgagacc cctctaacat caaatgggac gatgctggta 660 

ctgagtatgt catggagtct actggcatct tcaccaccat ggggaaggcc gggggcccac 720 

ttgaagggtg gagccaaaag ggtcatcatc tccgcccctt ctgccgatgc ccccatgttt 780 

gtgatgggtg tgaaccagga gaaatatgac atttcactca aggttgtcag cactgcatcc 840 

tgcaccacca actgcttagc ccccctggcc aaggtcatcc atgacaactt tggcattgtg 900 

gaagggctca tgaccatggt ccatgccatc actgccactc aggagaccgt gaatggcccc 960 

tctggaaagc tgtggtgtga tggccatggg gctgcccaaa acatcatccc tgcatccact 1020 

ggtgctgcca aggctgtggg caaggtcatc ccagagctga atgggaagct cactggcatc 1080 

accttccatg tttctacccc caatatgtct actgtggatc tgacacgctg cctggagaaa 1140 

cctgccaagt atgatgacta gaaggcagtg aagcaggcat ctgagggccc actgaaggac 1200 

atcctgggct aaattgatga ccaggttgtc tcctgtaact tcaacagcaa ctcacactct 1260 

tccacctttg atgctggggc tggcattgct ctcaatgata actttgtaaa gctcatttcc 1320 

tggtatgaca gtgaatatgg ctacagcaac agggtgatgg acctcatggc ctatgtggcc 1380 

tccaaggagt aagaaaccct ggaccaccca ccctagcaag gacactgaga gcaagagaga 1440 

ggccctcagt tgctgaggag tccatatccc aactaggggc acccaacact gagcatctcc 1500 

ctcacagttt ccatcccaga cccccataat aacaggaggg gcctaggagc cctccctact 1560 

ctcttgaatg ccatcaataa agttcactgc aacccaccc 1599 

<210> 38 

<211> 2627 

<212> DNA 

<213> Mus musculus 

<400> 38 

gagctgggga aaaccaagta ctattattct gttaaaggaa cataggcatt gtggtgtatt 60 

caaaaataag ggccatgctc agctatcatc agagatgttc ctgcaacagt gcaagaaata 120 

aatacagagt cccacagcca ggtattgtgt agagagtgat atgcatagag tttgcaagga 180 

tctgcacctg aaaggggtcc cattgttgag agaagtagac acatgcctcc agccttaaac 240 

aagaagctat caccaatgac cacttaaaaa tgaagatttt tttaccacct gtagtctcaa 300 

taaggataaa aacccctctt aaagacagcc cctatgccta gtaatagaga tggccaacac 360 

aaaatgaact caatgacatc tttggatatt atttgcatca ggtttttcca agtcaatttt 420 

ttaacctttt aggcactttg aatatatatt atttccaagt catttttatg ttattcttgt 480 

ggctacaaag gtatacaact ctgcatttat atgattttct ttggatcatt ttctatgttt 540 

tttttcctgt ttccattggt ttgatatttt tctttatctt attttatttt atatatatat 600 

tttaagattc ttgttggtct tcttacaaga aacaggcagg ggatgaatct tttttttttg 660 

ttttttttgt tttttttttt ttgttttttg agatagggtt tctctgtata gccctggcta 720 

ccctggaact cactttgtag accaggctgc cctggaactc agaaatctgc ctgcctctgc 780 

ttcctgagtg ctgggattaa aggtgtgtgc catcacacac ggctgggatg aatcttaatg 840 

gaagttgtat aaggggaaac catactcaaa attggtttta ttttaaaaat atctatttta 900 

aatctaagaa aaataggaca gaatttctaa ttcttcgtga aattggcttg aacatatttc 960 

tgctatttta aatatattct cagctactct gattacaaaa ttattatatc attcaaccgt 1020 

gatgtttttg aaaagctttg ggtctaccgt atcaaatatt tagctaacat ttaatttcct 1080 

tataaaataa gaacctctgt atctttagta ttcaaatttg tttttgtctt aaacataaaa 1140 

tttattttaa aataacttta tacttactat cagtaatttt atttttatgc catttacata 1200 

ttttaaaact tgctacagta taaaattttc ttgtgtatac aggattgtaa atagaactgc 1260 

ttttaattta aaaaaaaaaa gaatgtcact taagattgaa tcctgatata caaaagaaaa 1320 

tttgaacatc ttagaaatta gcattaatgt caatcagatt ggaaactcaa aatttaatgg 1380 

cgaatagttc ctaaactgac tgaaaatacc ttcaaaccat atccagaaaa gtgtccttta 1440 

acagtaagca tgaaacgaag actagaaaaa ataaagtcct tacaatgtat taaaatcatg 1500 

cagctaacaa tcttatttac tttaaagatc tatgaataac aacaacaaca accaaaatgt 1560 

caccatgacc tgaaatgttc agtaatagca tgtatgccta tatggtaacc aacagcgttc 1620 

ttactggact tattacccac actgtggaga tgagggagaa tcatgcctgg aatcagaaac 1680 

ctgtgagagt gtccaacaat actgaaatca aggataagaa tgcactatgg tcatttatta 1740

aaccaacaaa atctctaact aaattctaaa tgtttgtcat tatgctgtac atagataaga 1800

gaagtcctca ttactcatca agaaattttt ctgtgcagca aatgaagata tttaaacaac 1860

caattgaaat ttagagttgt gaggccaatt tcccaatgtg tatatatcca taactcctat 1920

acttaaggct taggaagcat tggcaaaagg acacagaaac ctggagtttg ctgtaagact 1980

atgatttcca gaaatgtgag aatttgcagc tatggagtct caccaaggct tccaaaatgt 2040

tgcctgaact gtgataaagg caatagacat actaacatca agcagcaaaa tctcatgacc 2100

cctcaaatct agacaaagaa atacagatac ctaaagaata ccgagaacag gagaaatagt 2160

ctacctcaaa gaaaagcaga ttaattggtg atccaataac gacagttcag gtctgaaaac 2220

atacaagaaa cagtactcag aaccaggctc tatttatgta attaggagtg ttcacatgca 2280

tgcatgtgca tgtgtgtgtg tgtgtgtgtg tgtgtgtgtg tgtgtgtgta gtgtattttt 2340

tgtatatgta tgtatatgca tatttgtaga tgcatgtgtt tgtatgtgtg tatgtgctag 2400

atacagatac ctatgtcaca gcaattaaag aaagtcatga atttgaaaag ggggggagat 2460 

ataaatgaaa gggaatgaga aagtgatgta attatattat aattttgaac tgaataatct 2520 

tctaatttca aacccgtcca tatatatgta agttcaaatt ctagcttttt caatttaaat 2580 

ctacctcaat gccagctgtc tctgtctttg tctttgtttc cttatcc 2627

<210> 39

<211> 1854

<212> DNA

<213> Mus musculus

<400> 39

tcaaaactaa ctccttgagc tgaccaggaa agatgccctt gagcagcatt gcagttttta 60

cagcacagac aacccagaaa cgtacatact gcctaaagtg atggtctgcc cagcatctct 120 

ctctccagcc agctctcaga gctgcccaga gcttcaacac cttgaaccct atgtgcttag 180 

gagtgacagt agtcaggata gcccagcagg cagctgggta tactttattc agggcacttt 240 

ctgtacatgt ggggttagtc tttatatgta ggagattatt aaatgccctt ttgctgcttc 300

ataataattg cagatacctc atctttgttt cctactactc ttctttgtag aggggcctga 360 

tggcctctgg ccctcctcta tcataattcc atgattcctt cccctgcttc tgtctttcct 420 

ttctgcggca gcttctctca accctccctc ttcattctac acattgtagt gggggcaaat 480 

ttttacttga gagatacgta catagtatgc agtttatcct tgtacagtgt acatttgatt 540 

gtcgtctatt tacagagctg tggaaccatt atcatcatca gttttagaac atatttatca 600 

ccagaaaaat cctgttccca ctagcagtcc tccttcacct ctcctcagac ctggcaacta 660 

ctaatctttc tgtccttgtg gatttgccta taacggacct ttcataaatg ttatatgcga 720 

tgtgcaatgt gtctgtttac ttagttttaa ggttcatctg tgccatggta tatatcatca 780 

tttcaagtag tgagtacctc tctctctctc tctctctttc tctctctctc tgtgtgtgtg 840 

tgtgtgtgtg tgtgtgtgtg tgcgcgtgtg cgcttagaca ttttcattgt gtggccctgg 900 

ctgcccttag aacaggctgg tcttgagttc acagaggaag ctctgcttcc caggtgctgg 960

gattaaaggt gtgtggcact gtgcctgcca gtgggtttgt tttgttttgt cttttctctt 1020

cagtttttta aattgtctaa atatttcatt acagtgcata taatgggcgt tctgtgattt 1080

ctagtcagga cgatcatgaa taggcttctg tatatatgtg tgtgcaacac cccttcatgt 1140



WO 2005/005597



PCT/US2003/027106 



tctccttcat ggacatgtgg gggttttggt ggtggtggag gtgggttttt gtttgtttgt 1200

ttgtttttgt ttttgagaca gggtttccct gtgtaacaag ctctgtctgt cctggactca 1260

atttgtagac cagggcctgg gcctcgaact cacagagacc ctcctgcctc tgctagccaa 1320 

gtgctgggat taaaggggtg caccaccaca cccagcgtgt tttcattctt ttgtatagga 1380 

atagaatgga tggatcgtaa ggcagctctt acctttagtc tttggaggaa ttgccaagct 1440 

gtgttcccat ggaggatatc aggattccag tttatcaagt cttcaccaat attgtttact 1500 

aatttgaagt gtgtttcatt gtgtatgtgc tgtgttttat tttgtgggca gtgcagggga 1560 

ccagacccag gacttgcaga tgtagacaag tattgtacca ctgggctcca tccacagtcc 1620 

cttagctggt tctgagttgc agtttcctaa taactagtga tgttggccat tatgtcttcc 1680 

tcttcctcct gtgtctgtgt ctgtggtttt aagatagggt ttcatgtgtt ggacaatatt 1740 

tttgttttgt cttttttttt ttttcaaaga tagaatttct caggctgggg aaataggctt 1800 

acaaccaaaa atataagagt tcggttccta gcacccatag ctgtctctcc agcc 1854

<210> 40

<211> 3683

<212> DNA

<213> Mus musculus

<400> 40 

acgattataa actgaagaca cttaattctg gtagatacgt agactcagct gataatacca 60 

atcactactg atcaaactca ggatccatgt gtgaactgta ttcatttctc tgtggtgctg 120 

ggagtggaac ccgggatctt gtattctagc aagtgctata cctttgagcc acacttgagc 180 

cctcccacac gtacagtctt aagcatgctg ctgcccacac tgacacacag agatgcacac 240 

tgacaggcag agtagcccca gcaacccaca tacagatgtg tgtatctaca cagatgactg 300 

agtctcacca atgcacattc ccaagtaact agaacctata agcttctgtg atgcttggtt 360 

cttaaactcc cccactctcc ccggcctcag tgcagacaat ggggccagct actaggcagg 420 

aaactggagt ggcaaggtag gccatggtta ttaaagaaac cagacggggt ggtggtgcac 480 

acttcaaagc ctagctctag gaagacagag gcaggtggat ctctaggcta gcctggtctg 540 

cagaatgagt tccagaacag ttagggggac ctaaacaaac cctgtatcta aaaacaagag 600 

ctatatccag ttgctgctgg cattgggcag gtgccagctc cacatacacc tccatggcag 660

tgggcacacc ttcctcttcc tctagggaca gcgactcaga cttgtacagt taccagcccc 720

agagctgtct ctctgcttca taagatgtca ctcctcattt ttaggggtga gaaaatgggg 780 

aagaatatat aggctgtaga caggataagg agttgagcca aggccaggtt ccctccactt 840 

tctctggtgt gctgtttgtg cacacgtgtg agtatctaga cttgcaaaga tactgcttct 900 

caacaggagt ctgggatgta agagacaggg actgggggct ggagaagtgg ctcagggctt 960 

aagagcactt attgctcttt ccaaggatct gggctctatt cacagtacct ccagggtagc 1020 

ttaccaccat cttgtaactc caattccatg gggatctgac gccctcttct gatctgtggg 1080 

caccaggcgt gcaaaacatt taaaatataa ttaagtcttt tttaaaagta actttaaaag 1140 

tcgtttattt ttatttcatg cgtattattg gtgctttgcc tgcgtgtatg tctgggtgag 1200 

gatgtccgat ttcctggaac aggagttaca gacagctgtt agctgccatg tggttgctgg 1260 

gagttgaacc agagtcctgc aagagcagcc agtgctctta accacagagc catctctcca 1320 

ttccaagtca ttgcattttt aaagagttca aagaaagaag gtgggatggt aggtaggtac 1380 

tgagcctctg cagtggggta aagtcagatc aggactggta tttgtggagc atgcccacct 1440 

ggctcagttg ccctctgcct gactcctccg tccatgctcc atctatgctt atctttggtg 1500 

tgctgtgttt gcttctacag atggggaggt tgcctgactg tctacccatc ggcctggacc 1560 

ccagagccaa gccagctgag tgagtgccga gggtcccagg tgaccggggg agtggcactg 1620 

ctcaggcctc tgcaaagact accctggaga agctgatgct ggtggagaca gccctttctt 1680 

gctgagtcca gcctctgcag agcctgcgca ggtaactgtg agccagtgta acaaagatcg 1740 

cctcttagct taggcagaga gagagacatt aatctagcct attcgaactc cccattatcc 1800 

agagaaggga atagaggctc atgtggcaca gtgagtccct tagcatcctg cctcctagcc 1860 

tgaggactct tccctatgcc tccatgacct ggagaactgg gcggagagat gggttgtgac 1920 

ggtgaccagg attcagggca gaggtggaag ccagcgtgcc tgatatatgc agctgacctc 1980 

cccaggactc ctctacagag ctgagaggcc atggtagtcc aggtgtgatt gcatgcccgg 2040

cagctggccg cagggcaggc catggcagga cccatctttc ttgtggccca ggtttagccc 2100 

acccaggctc ccagccacag aggccaggtg gggctgccct gccctataga ggcaactccc 2160 

tgaactgaca catgaagcct caggctccag gaagctcctt agagtttagc ctgctggaaa 2220 

ccccacccaa cttccaaagg agcattcctg aggtctgcat gggggttggg cctccaggga 2280

tgggaagggt ggggaggcag ggctgcagga ggatgtgagg ctgtaaatgt gctgtgctgg 2340

gtctgtgtgc aagcaggtgt gtggggctct ctgtccttag tgcacagggt tgtgtatgca 2400 

agagttacat aagtggccta tgtgggggga atagttaaga gcttcacatg tgtgctgtag 2460 

tggtgactgc tggctatgga tcctatgcag gtgtgggtat cctgggctgt ctaccacaca 2520

gggtactagt ggaccaagac tcaaacaccc ttgggtgggt ttgtgtgtgc tcagggttct 2580

gagcaceiagg tgaggtggca tgtgcacatg gctccctttg tgttttggag gtgggctggg 2640 

caggacttgg gctgcgctgg gctcagcaat gcccagcctt gtggccaaac tgctgaagtc 2700 

acttcctttt caacagtgac tttccagggg gaggagttct tccgacctgg tcggcagtga 2760 

ggggcgtggc cgaactggct ctctctgggg agggatgtgg gcgggggcgg gagagaggtg 2820 

ggaggaggtg ctagtggtac tggaagaggg agtcctctgg gagacagaag gaaagacaag 2880 

gacaggagtc tggaagggcc aagggcagga gggaatgggg gtggagtggg gctgaaagca 2940 

cagtccctgg gtgacctcgg agggaggaag ggagggctgc caatgaggtg accctcgggt 3000 

tcagtgagag ctggcagtgg cgcctacaca cctggcactc ggggcaaggg gctggcaggg 3060 

ggcggttcag gaacagacct gcttgccagg tgccccactc tggacaggaa gagggtgggc 3120 

gggggctgta caaaggagct ctgtgtggct gaggataggg tagggtgggg tatgcagtgc 3180 

tgtactgttc tggggttggg ggagatgatg ggggcggggc aggaccagtt ccccttgggg 3240 

catcagtggc tccaggggga cacctagtgg tcaggggagg tagtgcatct tgataacaaa 3300 

ctgggggaaa agagattaga agtggtagtt gagatagttg aggaggccag ggctagtctg 3360 

aatctttgga tgatgaagca atttgactta aaggatccca acaaaaccaa acttaggtga 3420 

caacaaagct gattggcatg gctgtgtgtc cttaagggca tgactaagcc tctctgtgtt 3480 

cacatttaaa tgcaaaacaa gtgactgggg ctggtgagat ggctcagcag gtaagagcac 3540 

ccgactgctc ttccaaaggt ctagagttca aatcccagca accacatggt ggctcacaac 3600 

catccctaac gagatctgac tccctcttct ggtgtgtctg aagacagcta cagtgtactt 3660 

acatgtaata aataaataat ctc 3683

<210> 41

<211> 2311

<212> DNA

<213> Mus musculus

<400> 41

aaatgtgcgt tattggggtg taattgcgat tgctagcagt tttgagtagg atggcagcat 60 

agttgtatat atccaaggct gatacagaat ttggtaggac ttgttatgtt atctatcaat 120

gSSagacaag ttgctggaca tcaagctcag gggttatgcc tgctatgaaa ggctaccagc 180 

aattgagccg catcccagtc tgtctgtgtg tcttgtcttg tctttttctt ccttccttcc 240 

ttccttcctt ccttccttcc ttccttactt ccttccttcc ttccttcctt ccttccttcc 300 

tctctctctc tctctctctc tctctctctc tctctctctc tctctttttt ttttcttttg 360 

gttttttgag acagggtttt tctgtgtagc cctggctgtc ctggaactca ctttgtagaa 420 

caggctgacc tcgaactcag aaatctgcct gcctctgcct ccttagtgct gggattaaag 480 

ggcgccacca ccgcccggct atccctgtct ttcttgattg cgagatttag aacatgggct 540 

taatgaagaa gttgcttagt gagataaggt acaagttata tttttaaaat taaatttata 600 

aaactacata atacttttga aatcatttaa tgttcaaata tttttattcc ttatatctca 660 

tttaaagaag tatcacacag aaaagcttaa aattatacat gagattgcta taatagagtg 720 

tatgagctgt agcaatttca ataaaacaga tgcttgaaaa attatgtaat gactgacctt 780 

actaattgtc ataaacttca catgtcactt gtaatagggc aaaaactgcc tctattaagt 840 

gttttataga aacctgcttt attgcatttt agttacagga ataattcttt tgttgggata 900 

caacctccac tttgtcctct ggaagagtag agaccttgtt agatgttcag gataagaaga 960 

aagatacgag atatataaac tgtattgtga aagctctgtt ttagattgaa agctgcctgg 1020 

aaataggtac tacagttttg ttttttcaac ttataagctt attaaaaatt cacgaaaagg 1080 

agtgtacaag acacacatct atctaggatc cttttttttc ttcctgggac acattcaaga 1140 

gagagcagtg ggttctgaag tgccatttgg ttttgcattg cctttatttt tggtctagcc 1200 

gcactgttgc ccagacttgt cttgaactca tggactgaag ccatcactct gcctcagcct 1260 

cttaggtgca tacatggtca cctggcgctt cttaccgtct ctgctacaag agtacttgca 1320 

catataattt atattgttac agtaaggcat taatgcagtc agacctgttg ctaagggtca 1380 

gggttccaga atcttacttg tgctaaatta atgtgtgttg gttggttagc tggctggctg 1440 

gctgttgttg aatttgtatc aaccttagtg ctgaaattac aaatgtgtca tcacatcgac 1500 

ttaacctaat ataattgttt ttgcccagtt agactattca ttttccaaag tccacttaag 1560 

gactttgttg tgtatggttc taggtagctt gccacagtat ccctccctcc agattctagt 1620

catgagggag cttcctgctg gttgttgctg cactgcatat cagtttcttt tgacagaaga 1680

ggtctacagc tttttacaat actttgataa tttgaactat atttgcatcc ctacatttta 1740 

caggtgtgtt tgctacagac cctttgtcag tttcagcatc aattgttttt ctaaaaagtg 1800 

tagtcttagt ttaatgaact tgaatttgta agatttatat aatgtatata tatcctttga 1860 

atctttgatt tactgtcttc tattgattat atctattctt attagtccct ctggaccctt 1920 

ttcttatcat ggccacctcc caactttatc tcctcctcaa agtagttttt ataagaattt 1980 

gtgtatgaaa aagaggcaga gaagatataa atatgccatg tgtacccaca gaggcattgg 2040 

atcttcctac ttacgggcat tgtgagccac ctgatgaggc tactgggaat agaacttgtg 2100 

ttctctggaa gagtggcaaa tgccatggaa tcttctccag ccccctttcc tttttttttt 2160 

tttcttttct cctcttgcct atggctttgt ttttgttgtg ggggagtggg gatctttatc 2220 

tttacttttg taaagtgttt ataaaaaccc atcagcattt ttcctactct tgcttccctc 2280 

tttaaaactt aataaatttt ttgagaattc c 2311 

<210> 42 

<211> 2421 

<212> DNA 

<213> Mus musculus 

<400> 42 

tctctctctc tctctctctc tctctctctc tcttttaata ttaacaggag accaccatgg 60

acagcatcag ctatccatgt tatttacctc agtttagcca gttatttacc tcaggacaga 120

atggaagtgg gagataaaga cagaggtgta gggaggagag ggagaggagc acctccgtga 180

cctcttcacc cagttacatc aagttccaga accttccagg acagtgccag cagctgcatg 240

ccaagtgttt aatacataag ccttttggga acatacttcg tatctaaatc acagtgagct 300

atgtgctggc atacagtgct cataccctca tctgagccta ctcccccgta accctgcgaa 360

tctaaggtct tcctgttagg ccagtgaggc agctcagtaa ttaatggtgc ctgatgccaa 420

ggctgacacc ttgagtttga tccccagagt acacatggtg gagggagaga cctgactttc 480

aaggtggttc tctaacttct gcatgcacat cccctcctgt cctgcccgac aagtaaataa 540

agtgtgacgt aactcattag taagaaaatc aagtcccact cctcaaaatc tttttttttc 600

tagagatttt atttatttat ttatttactt ttaaagattt atttaatata tatgaataca 660

ctgtagctgt cttcagacac cccagaagag ggtatccgat ttcattacag atggttgtga 720

gtcaccatgt ggttgctggg aattgaactt aggacctctg gaagagcagt cagtgctctt 780 

agctgctagg ccatctctcc aattctttga aagtggattc ccccccatga gtggttagca 840 

ccatttttct gtatgcagaa tcgtggcacg gggacttaag ctgcttctaa actacaagtc 900 

gtcttagctt ggccgttatc ccaatcggta cggccacacc cagggattcc atctgtcacc 960 

aagctcctga gtatcttggt gtacttgatg cttgttggaa ttgggttttc tctggtcagc 1020 

tgaatcattg gccacccatc tactaacttc acaatttgca aatctaggtt actgttttca 1080 

tcatcatttt aatttttctt tatgaatcta taccattaca caaaatgtat tattttcatc 1140 

ttgtggggga attcaggaga ggatggaatt aaaatgtgtg tccagctaat actgatttac 1200 

ttggacatct acattttatt gacagaaaaa acatgctgtc aaattgtttt attaaggcag 1260 

ttccctccat cctggactga ctgacttaga aaacctccat caataaaaga cattgtcttt 1320 

gtcataatgc ttgcagtttg aacagacagc attgaataat tatggaaata aactttgatg 1380 

ggctctgaga aggaaaaaaa gtctgatggg aaacatgaat atttacagta tagattctac 1440 

tagaatcttc caaagggcca tcctcactat tggagaaact ttttagtgta acgaccacag 1500 

cactcaggat accagctgtg aacagtggtg ccattaacca ccaagaggga gcacagtcta 1560 

gtgacaaaaa gatagaaata aagggagttg gagtcacatt tcagagttct ttggcaagat 1620 

aaagctgttg gcactaaatc aatacactca tcatgactgt gtcatcaact ggacaacgtg 1680

ctaggagacg gttaattgct ttattctttt cttttttgaa tgtgtttgcg tgtgtgtatg 1740 

tgtgtgtgca tgtgtgtgaa tgtgtgtatg tgtgtgtgtg tgtgcatttt gtgtgtgaat 1800 

tgtgggccac agaggagcag tggaggtcag atgacaacct ccggttggtc ctcaccttct 1860 

actgtatttg aaacaggacg tcttgtttgg tggtaccact gtatatacac atcacactaa 1920 

ctgctcatga ccttctggag catcctacct tggcttcctg tctcacctcg gctgcactgt 1980 

cgtcacagaa tatacactac cgtgtcccca gctttccatg gcttcagctc ttaatgacac 2040 

gtgtttttgc tcaccgaacc ctctccccag cgtttagcgc ttattcttgt atgaacaaac 2100 

ttgtatctaa cacctactgc atgcacagac tacagatgct agctgttaag agttatgcaa 2160 

tgacatccct catagaaacc atgcatgtcc ttgtttggtt tttctgccct taacctcttg 2220 

aagttgtgga tttgaaacca cagatgagat ggatgagctg gtgagatggc tcagcaggca 2280

catgtgcttg tcgtcaagcc tgagatagag tccctaaaat ctgtgtggag gaagaaggcg 2340

tggaacttct aacagctgtc ttctgacacc cacatgtgta ctgtagaatg ccctgcccac 2400

ccccataaac aaataaatgt t 2421

<210> 43 

<211> 2545 

<212> DNA 

<213> Mus musculus 

<400> 43 

aagcagtctc tacatggcct ttcctacagt ctctgatcta ctctttccct ctgtatttcc 60

tttagacagg ggccaagcct aacctctgta tatggtctct acaagttctc tctccccttt 120

gtatttcaaa taatgtcatc actgttgggt tctgggaacc tcttgctttc ctggcatctg 180

ggactttatg gttgctatcc ccatttcacc atcccacact gctatacact tctgtttaaa 240

tttgtgaccc tctgtacatc tttactgttt cctcccacac ctgatcttgc ccccttttgc 300

cccgacctct ctctctcctc actccctccc aagtctatct ccctcattcc atcttccagg 360

atggttttgt tcccccttct aagtaggact gaagcatcca cactttgatc ttcctttttc 420

ttgagcttca tttggtctat gaattgtacc acgggtattc tgagcttttt ttgtaatatc 480

cacttattag tgaatacata ccatgtgtgt tcttttgtga ctgggttagc tcacacagga 540

tgatattttg tagttccatc catttgcaga atttcatgaa gtcattgttt ttaatagctg 600

agaactaatc cattgtgtaa atgtaccaag ttttctgcat ctattcctgt tgaaggacat 660

ctaggttgtt tccagattct ggctattata aataaggctg ctatgaatat agtagagcat 720

gtgtccttat tacatgttgg agcatctttt gaatatatgc cccagagtgg tatagctggg 780

tcctcagata gcagtactat gtccaatttt ctgaggaact gacaaactga tttccagagt 840

ggttgtgcaa gcttgcaatc ccaacaacaa tgaaggaatg aatgttcctt tttttccaca 900

acctcactag catctgctgt cacttgcgtt tttgatctta gctattctga ctagtgtaag 960

gtggaatctc agggttgttt tgatttgcat tttcctgatg actaaggatg ttgaacattt 1020

ctttaggtga ttctatgcca ttcaagattc ctcagttgag aaatctttgt ttagctctgt 1080

acccattttt taatagggct atttggttct ccggagtcta acttactgag ttctttgtat 1140

atattggata ttagccttct atcagttgta gggttggtaa agatcttttc ccaatttgta 1200

ggttgccatt ttatcatatt ggcagtgtcg tttgccttac agaagctttg caatcttaag 1260

agatcccatt tgtctacagt tgatcttaga ccttaggaca ctagtgttct gctcaggaaa 1320

atttcctatg cccatgtgtt cgaggctctt tctcactttt tctcctgtta gattttgtgt 1380

ttctggcttt atatggaggt tcttggtcca cctggacttg aactttgtac aaggagataa 1440 

gaatggatca atttgcatta ttctacatgc agactgctac ttgagccagc accatttgtt 1500 

gaatatgctg tttttttttt ttccccccac tggatgattt taacttcttt gtcaaagatt 1560 

agatgagcat aagtgtgtgg gtttaattct gaatcttcaa ttctatttca tcgaactacc 1620 

tgcctttctc tgtgccaaga tgatgaggtt tttatgtata ttgctttgta atagagtagg 1680 

gatggtgatt tcccccataa gatctctgtt gagaatagtt ttggatatcc tgggtttttt 1740 

ttgttgttgt tgttttttca aatgaatttg agaattgctc tttttatctc tgtgaagatt 1800 

tgagttggaa tttgatgggg attgcattga atctgtagat tgcttttggt aagatggcca 1860 

tttttactat gttaatactg acgatccatg agcatgggag atctttccat cttttgaggt 1920 

cttctttgat gtcttttttg ggagactaga agttcttatc atacagatct ttcacttgct 1980 

ttgttagaat cacaccaagg tattttatat tacttgtgac ttttgtgaag ggtattcgca 2040 

attcctttct cagcccattt attttttttg agtagaggaa aactactgat ttgtttgagt 2100 

taattttaca tccagccact ttgctgaagt tgtttaacag ctgtaggagt tctctggtgg 2160 

aaattttagg gtcacttata tatactatca tatcatctgc aaatagtgat atttcgactt 2220 

cctctttttc aatttgtatc ctatttgcat agaaggctaa tatccaattg aagatctcct 2280 

tttgttgtct aattgctcta gctaaatctt caagtactat attgaataga tagaaagagg 2340 

acaaaaagaa gcaaacacac ctaagaggag tagacagcag gaaataatca aatttgggcc 2400

tgaaataaac caatttgaaa caaagagaac tatacacata attaagaaaa ccaggagccg 2460 

gttctttgag aaaatcaaca agatagataa atccttagcc agacttacca gagggcatag 2520 

agagagaatc taaagtaaca aaacc 2545 

<210> 44 

<211> 2435 

<212> DNA 

<213> Mus musculus 

<400> 44 

tctctctgtc ctgcttgtca ggaatcagca tgatcatgag ctgtggttag acatgatggt 60

ttacaaaatt ctattgaacc tggcatgaaa aaaaaaagtc tcgggaaata ttctttttaa 120

gttttcagtt atttcagaac cttactaaaa tatactgcta cttgctttta aagtcggttt 180 

tgcgcacata cgtcttttgt tgctttcagg ttctcccatc ctctgtttgt agaaatggga 240 

cagacaccat ctacaatgta agcacaaaga attgccaaat gtcttaagtt catagatgtc 300

ttcatagaag tgatactatt atatttttct ctttctctat tttcccgaag tttttttttt 360

ctgaaaaagt tttttttttt aaagaagatt gggtgattga gaaaaatcct ttttgtctct 420 

ttgccttgat gctgtctttt aaatgcttat tatcattatt aaaaggtttg tgtactttct 480 

cagtgtttat ttttctctca ttgtttttgt ttgctgtttt cctcaagtac catggtatag 540 

tttccttgaa actagagggc gacagattct cagctcacaa acggaaggcc aaaatatcca 600 

ttgttagtca gccacagagg acaatcaagg tggcagagct gcctctagct gataaggtgg 660 

aatccacaac tgatttgcac tttctcagac aggtatggaa atcttccttc cttcctgttt 720 

gccttccagc aagctgttcc ctgggccaca aacacacttc attcacaaaa tggaccccaa 780 

gttgggggac agtttacttg acacagatgt tgtactgtag atgaagaccc aaggaaggaa 840 

actgctatca gcttgcgtga aatcaattct aatctgccac tcttgcctac agggtctgcg 900 

gcccttgttt ccaaagaatt gttctgtgga cttaaaggga ctctttcatt ttgaagaaag 960 

cacccacaga ttgtatcaat gtgatgggat ctcctggaaa gcctggagcc cccaaaccaa 1020 

ggtgggacat tggctatagc taagcacctt taatgagcaa tatgccattt aatgagctta 1080 

cagcgtgatt caatgtgtga ttctcaaatc gcacatgtct tttcttatta tacctagaaa 1140 

gccatgaggg aaacatggta cgaagtgact ttttaagatc caaagtataa agccaggtgg 1200 

tggtggtacc tgctttgaat tcctgcagag acaggtagat ctctgagttt gaagtcagcc 1260 

tgatctatag agtgagtttc aggacagcct gaactacaca gagtctgggg gaataaacaa 1320 

cagcaacaac aagatccaca atatagtatt agaccagtcg attttgttta gttatcagaa 1380 

tgtcagtggt ataacattga accatttctg actgaggtag tcctttcagt aagaagtcat 1440 

ttatttctta agatgagaaa ttacagcgag ggcttcctct gcttcgctga agtgagaagg 1500 

ctcacaggct tgtactatga ggccgacttc agatgatcaa gtccattccg gggtgaacac 1560 

gcacaactgt cttgcagggg ctggaggaca gatcctgccc aggtggatgg ctccttcatt 1620 

ctggctactg tcacatcttg gtcacaagac agaaaggcac ctggaccaca gctacccgag 1680

cttgcagaga acagtaagga acccgacatc acacgtgcct gccgttttcc ttattttcac 1740

atggaaatgt aggcctgcat aagtcattga ataagtattg aacgtctact gtctgtccag 1800 

ctttgtctaa ggagcctaaa gggagtttgt agtttggaat ggatttgtgg ctgactttga 1860 

tacttctgtc tgctaaggtc tttgctgtta tggctatact ggtgtcacac tcttttctgt 1920 

ttccttttat ttccacgttt taaagaaaaa gtctaacaac tttcctgagt tagcagcatg 1980 

gtggcttctg atattttctc atctctcttt gttccccctt caaatgacag acatccatgc 2040 

ctattgattt taaagtacct ggtaataaaa ggccttggca tccagcctct ggcagtgcag 2100 

gaggcacgga tgcccataaa gccaggcgag taggacaaag gctcactggg taacagtcaa 2160 

taaaagatga cttgtggtca gataatgaca acaaaatcca tctgatccag aaggagtttg 2220 

aaaatgcata tgtagcagtc cttgaccaac ctttttattt gataacctgg atattatcca 2280 

aacaatgcaa ttacactaat atgcatgcat tcttgttgaa attgagactt cagtaatttc 2340 

tagttaatat tgatcatatc ctatacttta tcaaatttaa aagggttgtg ctcatcaaag 2400 

gacaccatta agaaaatgaa aggtaaacca tatcc 2435 



<210> 45 

<211> 1718 

<212> DNA 

<213> Mus musculus 




<400> 45 
aagtgaccca cattactaaa gaatgtcttt caagagagga aaactaaatg gccagtaaga 60

aaggatcttc acaaagactt tccgaggcat tgaggtccaa ttcttcaagg tcttcagagg 120

ctgccaaact caggctccca ccctctcctt ccatgggagt ctgtctaggc tgaaatctaa 180

gtccaaatga atgaatctgc ttggggggtg ggagcaagcc acagctgtga ctctagctca 240

ctgggggttt ctgttttata ggcggtcagg atcgagtctc tgaaattatc aacttgggac 300

taggaaaaca attgatacta tctgatttgt agaatagccc tgggtaggga aatactaatt 360

ttaaatatca ttcatctttc cttttatcca gtctctgcgc tgtaatggaa ttaggtgaaa 420

ggactgagtc caagctgtag tgagctcagg ggagcttgca caccgaactg acaaattaat 480

ctgtggacag ccctcatcca gtacttattt agtcaatagt cagtgaatga gaaaagggga 540

aagcacattt ttcctttttc tttctctatt ccttccttcc ttcttccttc cttccttcct 600

tacttccttc cttccttact tccttccttt cttccttctt tccttccttc cttcctccct 660

ccctccctcc ctttttctct ctctctctct ctctctctct ctctctctct ctttctttct 720

ttctttcttt ctttctttct ttctttcttt ctttcttatt tactcattag aattttttcc 780 

aatcagcata ctgtgaacct tctgtatctt ttccaaaacc atcaattttc agcttaattg 840 

tttttccccc cttctgacta aaatgtgaac tcttgaagta aacgtattta catgttaaac 900 

ttcccagact gagttaagag aattagaaaa ggactcagga gataggaaat atgttcagct 960 

ctgtggaaga gctcattatg gaacgtttcc caaacagctc tgcaatgaat gcattgtgag 1020

accacctata atcgatagac acatgtgtaa actgttaaag gaaatactta gcaaggtttc 1080 

agaatttgag gctgatgggg ggaacttctg ctatttgaat ttataaggag caccaggttg 1140 

cagggcacat cacactagca ctcctgtgat cctagcacat gttattatat aagtctgttt 1200 

gtgttagtgg gggtacatat gtgtgcatgt gtgcacatgc agagggcaaa tgtgaacctg 1260 

tgtgtaattc cttagggtac tgcctacctc ggtttttttt aggcacaggc tctcattggc 1320 

tcaggcctcc cttatttggt tatgtttgct agccagtgag cccgagggac ccacctgtct 1380

ccccagctct gggattcctt ctacctgcca acaggttggg gatttttttg tgtgagttgt 1440 

gaagatccaa ctcagggctt cgtgcttgct aacatagcaa gcatttatta gccatgttat 1500 

catctctgct tcctggggtg tgtccattca ttaacttatt caataagcat tgtgactgaa 1560 

tggtgcaatg ttctactgag aagaacttag aaccaatttt ctgtgctcga atcctgactc 1620 

taaaagatat catctgcttg aacttggcca aatcacttga cttctctgaa ccccagtttc 1680 

ttttttgtag aaggggtaat aaataaataa ggggtagg 1718 

<210> 46 

<211> 3044 

<212> DNA 

<213> Mus musculus 

<400> 46 

tctttttatt ttatttattt attttttgac aatttctctt tgcttttatg ggggagagaa 60 

agtttttcct tgcttaggag gaaaatgaca ggtctttact tggttgtttg tgctgagacc 120 

acactctaag atatatttga aaagtacctc agctgaagca gcctctgcta gttcacagaa 180 

gaacacaaag aagctacccg cacaaagctg gataagtcag taagagaaga ttccatggga 240 

tctaagaaca gaaatctgtg ttttagtgtc atcttattat ccctacctta tttcttggcc 300

tctctgtgct cttgacttta attcagggta agtagccctt gatccccttc ccctcttcag 360

acacgggctt ccacattggc tacccgacgc ctgcttctgg agactcctct gttgagtacc 420 

tgcctttgca gtggttgccc caaagctgta gtgagacagt gacatggttg cctgagactg 480 

tgctgtaagc agaacatgtg cggatgaagg tttccttact ccagactgga ggagatgggt 540 

caaccatagt tcttcatata tggcggatac tgtttatatt gcagacagca gtgacagaaa 600 

aattatgatc gatgcttatt ttatttcagt ggaagacaca attccatgaa tcagaagaca 660 

acaaatgcca agaaaggacc aggttaaaaa ggttccaaga tcaaaccatc ttccaaaact 720 

caacccgtga ctccagagct gggcagactt ctattttttc catctttttc gtcattcagt 780 

gtcttcagtt taggaagcta atgttcaata cattcctcag gggaggaaat agatgaggga 840 

gagcggcttc tctcactctt tcaaaagttg attcttctgc cctgcataaa agtcatgggt 900 

ctccagcaac cctaaatatc cagtacatgt gcttatagtt cttacaatac cttagaaggg 960 

acccgaaggt tgacatcctt ctggcactct cagtagagtg tttctgatat tctctgattg 1020 

cacttttatg ttgtatttgg tttctaaatt gaccattagt tagaactata tattgtttta 1080 

tatattatat ataatatctc cacctttttg ttattttaaa ttgcatcatg ctcagttcaa 1140 

ctttctaaaa tcaaggtttc aacatattcc cagatcacat gcatatgaaa gtttcaggaa 1200 

aatattatca ataacttcag ccatgcactg cattgttctt gcccccatcc cccccccccc 1260 

gccagtaacc ttgaggattt agggttatcc tatcacctca tgtttatttc ctacatctag 1320 

aaaaatggct gtgctaaact gtataattta taaaatttct agaaattcag tcacaggggt 1380 

ctgccactgt gtatgtggca ttaaagtcat tttgtgacat ttactaattc acaaaaatgg 1440 

tctagttttt ggaccaaact ttgcaggagt ttggaatact ggagactaga aggaaataga 1500 

gggagaataa gaaattgttt ttaggaccta tgaactttat tgaattttat tcatgtttag 1560 

ggtgttttgt gactcgaatg ttttttctaa ttgtagtaga gatcaggaag attcatataa 1620 

cttacccatt taaatgagag aaattattgg aatttaaatt tcttgtgcat gttaagcaag 1680 

tggtgtttta acctatatct gatgaaaaaa gtcatttggt tcctttt^ta tgtaatgtgc 1740 

cttttacacc tgaggggttt atgtgtatgt atgtatgtgt gtatgtatgt atgtatgtat 1800 

gtatgtatgt attatgcata atatgtacat gtatgtgtag ataatagaac cataatgctg 1860 

cactcatgaa cttgcagtag ctatggttgc ctgcacaaaa cctgctcaag atcaagctag 1920

ccaacattcc agctccagtg gcagagggac acatgggcct cacccctagc tgagaagctg 1980

gtgacagcaa ggtctgctag gagagggaga gtccgtcctt ttaagggtgt ggtccctggt 2040 

aggttcagaa tgatccagta gatggcccca cggctatgta tttatggaca gcactaactg 2100 

gcttcaatgg aatagatgtg tgtgtgaata aacatatata tggatgtata tgtgtgtatg 2160 

tatacacacg tatttaaaga tatgctgttg ggaaggggtt atgggatctt ggaattggtg 2220 

tgtgtatgaa ataaatacat tgtatacatg tgtgaaatta tcgaagaata aaatatttta 2280 

ataaaatgga atattgacat gatagggctt taatgtacat attttctctt tctttatttc 2340 

tcttccaagt ttgtttctaa agactgattt ttctcctggc tttgggtctc ccattctgct 2400 

tcatatatct atacagtttt tattggatgt tgagtatcaa taatatcata ttgcagattg 2460 

caatattcct atgagcattt aaattgctta cagttctatt taacgcctca gacttccatt 2520 

ggagtgagta aacctgccaa gagtgaggtt gttaagatat agctagcctg aggccccagt 2580 

gaagttaccg tgaatgacac aaaggacaca tagctctgct tggcccagac ctggatgcct 2640 

ggagacctgt aggcactctg cttcaccatt ttgcaattta tgccttggtg tcctgtaggg 2700 

gctcgtggta tcctgtgcag acttggggtt ctgtttttgg acaggccttt gtgttcacca 2760 

tcagcctctt ccaccagctt ctgtctcctc taatgtggga aacggacctg ttaccatggt 2820 

acttccttct ctgaagtctg tgatgtgtat aggggaggag ctaaggaggc aactggactt 2880 

gctgccttct caaggatccc ctcttacacc tgcaaacaac tcttttatct atttttatct 2940

acccttctat ttagtcacag tgaatggcta aatttggctc tgccactctt gcaaggccag 3000 

aactgaagaa cacattcact tgatattaaa actattttta aagc 3044

<210> 47

<211> 3100

<212> DNA

<213> Mus musculus

<400> 47 

aagtttatag ctatactgtg tacatagtat atataatata tatgaaatat atatccatat 60

gcaaaaagtc ctgcatgcct caattttctc atccctgaaa ctggaagctt cacttatcat 120

ttacaaacag gttccaacat tcctcttttt gtgtctggtg ccagaactgg tttggaagct 180

gttaacatgg ctgttttgct tgctgcacaa ttcggtttcc atctgtgctt atttacagac 240

aaaattcaat gttgggagat gcttctcaag gttcaatctc agacctttta ctttcgttgg 300

tttggttttg gtgccgccga acggtgtcag gtgagcaccg tgagtccgct cttctcccct 360

tgtgttttcc cctcgtctct gcggatactg tacagcaatg gtcaactttg ccacttgcac 420 

tgagttttga gtcaaaccta ttttcttaaa tgaagttgta acttcggtat aactcaagta 480 

tattgtatat tctttgcttt tagttaaaaa aaaaaaagta aaacatttta gctaattaaa 540 

aagcactcag gtgataatta tgtaggaaaa aaaaaaaaaa acaatcttgc caaataatga 600 

acccatccta ggatgtgtag acaataatct gcttgaatat ttttgtagct cacttcctcc 660 

ccacgtttcc ccaagtaaag ctgaagtgca gatgattcag agctgacact ggatgctcaa 720 

gtcctccaca gggacagagc ggatggctcg aaggactgca gagcaaaaga gcgggagcct 780 

gcggtggtgt gttagaacgc cacaggcact ggtgaggaga cagcagggga ggaattctct 840 

tcatttaagc atttctttct ggcctctgct tagacagcgc tcagaaatgc catgtggtag 900 

ggcctgcttc tttgaaggtc acagctaacc aacccccagc tttcctgccc aggcctggcc 960 

tcagctttca gtggcagccc cctagattaa ttgagctcac caagagtagg aaagagaatg 1020 

gcagaatgga gcctgggatc cacaaggact taggctaatg catttctttc ttccttttac 1080 

ccttccaatg ccctctgtac tcttgaggtt ttgttctgca ccccccctcc cccaagtctc 1140 

ttgtctgaaa gctgcttcat cgaggcatag gacagatacc gtaggagccc tctgccctct 1200 

gcccaagccc tccacccctc acccccacct ttctcccacc ccaggtaata atctgcttcc 1260 

cttcctaaaa actgcttggt ttgcagatct gtcgagcagc ttccttggcc ccagggtatc 1320 

ctggtgcaag ccatgtttac aggaaggcat gccccagggg tcagctccct cctcccaaat 1380 

ggtctctatc tatctgcttc tgttcagcag cctggagacc actccagctg tgcaaggtta 1440 

tccagaaaag tctgatgttg ggatagggta gagggtaatg gggactagat ggatggttga 1500 

ctttgtttct tcctctgtgc cattgtttgg acaatattaa agctgcatgt aaaggggaaa 1560 

gtaatgtatg actagtaggc aaaagtgaaa ccctagtgac acgttctagt agtactaact 1620 

tctttctgta cctgtagtag tactaaaggt ttaagtatgt atgaatgcca gaagttgttg 1680 

attcatgcag tagaatttaa ggggaaattt acttctttta aaataggtcc attttttaaa 1740 

acctgcctct gggttttgag agagagagag agagagagag agagtgtgtg tgtgtgtgtg 1800 

tgtgtgtgtg tgcagatttt gatacagcta tgttggagct gccatcgttg ttacaacgtt 1860 

ggtacttcct ggtctactct aacagttccc ttcaccagag gcatgtctcc atgaaaagca 1920

ggtaaagcta acgtttagct ccttgcgaat catggggtac tcacagttgc ctgttatggt 1980

acaggagtca gcagacagta tttttttttt taactctctt attgcctttt taagtgacct 2040 

cttctaaaga aacagaaggc ttgtattctg ctcaggccat catcagtgcg gagaagcctc 2100 

tctgcttgtt ttatgttttc ccagcccgtg ctgcagctgg aggtcttgcc acattccaca 2160 

gtttacccac tacgtttctg ttgccagtag ctcagaccca tggcagccac ttccagggct 2220 

gggggccagg gcgtcaggtc tcctcgtttc tctctgccct aactcagccc tcatcttcct 2280

cgttttcttt tcctcttcca ccccatctgt gccatggaaa ctagcatttt caaaggactc 2340 

taaaaactcc tatttctttg ttgttgtttt tttttttttt tagtttggtt ggtttttggg 2400 

gtgttttgtt ttggttttgt ctttgttttt tgagacaggg tttctctgta tagccctggc 2460 

tgtcctggaa ctcactctgt agactaggct ggcctcgaac tcagaaatcc gcctgcctct 2520 

gtctcccaag tgctgggatt aaaggcgtgc accaccaccg cccggcttct ttggtatttt 2580 

taacaagtaa ttttattcag tatccaccag gaacagcaac tggctttgtg tagttgctca 2640 

cggggcatgt ccgtgtcttt tactgttcca aaccattgct tatcaaaatt gtttctgagt 2700 

tgattcaaaa ggagacctca ctggggacca gaatctaagt tctttaagtg gaattcagac 2760 

gtccccagtc tgtccttccc tggaatccct cagtcaacca cattccctct gtagaaaaga 2820 

aaggggaaga gaagagagga gagtgtgttt taaaggaaca tttagttgtg tgtttgaagg 2880 

tcttttcttt caacaataca ggacactcct atccaaaccc aggcctcccc ctcggagcag 2940 

cctctgtcct cctgtgtctc tgaacagcct ccttcaggtt gcccaggctg tgggcaggtg 3000

tgtgctttgc cgagttgtgt ttgtgtctct gcgttatctg gggggtgcct tgaataaagt 3060

acaacttcat gacttactga ttctggaact aagcagttcc 3100 

<210> 48 

<211> 2023 

<212> DNA 

<213> Mus musculus 

<400> 48 

gaggaggagg agccgaccgg agagagatgg agcttctggc cagacttcgg aacaagaagg 60 

acacaacaac cttgtaaggt cttgtagagg cgataactgg cagcctctgc cactggcgga 120 

tagctgatat aagctagcag ataaggttag ggcaggagat tctttgccca ccgattgtgt 180

tctctggcca gttttaagat aataaaacag tctatgtttt tcttccttgg agaaaagctg 240

gacagagaaa gagggaagcc acctgacagc ttggtgaagc taagcttcga ggggctagcg 300 

ggaatgatca cgatgccgag gaaggcacta agccaggagc gctggcagag ccctcgggaa 360 

ggcaatagcg tgttttataa aattacacgc aacacaactt atatgcgaaa ataaaacaca 420 

acgagctcat aagcaggtgc agggaggact gcagggtggc atgcctttct ccaaggtttt 480 

tttttttgtt ttgttttgtt ttttttaaac tcatttagac acaacagttc aggctgacct 540 

ctgttttgat ggatcctgag tgaaggctac agttggggaa caacaacaac aactactact 600 

actacttact acttctacta cttctactac tactactact actactacta ctactactac 660 

tagcagcagc aactactact acttactact tctactacta ctactagcag cagcaactac 720 

tactacttac tacttctact actactacta gcagcaacta ctactactta ctacttacta 780 

ctactactac tactactagc agcagcagca gcagcagcag gctggctaga gagatggctc 840 

agcggttaag agcacttagc tgctcttcca gaggttccga gttcaaccta cttggtggct 900 

cacaaccatc tgtaatggca tctgatgccc tcttctggtg tgtctgaaga gagcaacaat 960 

gtactcatat acattaaata aataaatctt ttaataataa taataagcct cagaatatat 1020 

gaccaacttg atcttgctct gagccaaact atttttctat gtctatagtc aagctgtttt 1080 

tatgtcagca atatgggatg gctggcaggc atgaaacaaa ggggttacag ctaagctaac 1140 

ggtttcatca atgatcatgt gccctagagc ctgggattta cctttgagaa ttcgtagata 1200 

gaaacgatgg gtttggactg ggtcttctcc taccatcctg ttagtatggt ttaaatgtcg 1260 

ggtttcttta acgttggtgg aacaggtcag aagcggcatg catagaaaca aagctccaag 1320 

ctgctgtctg aagcatcatt cagagttcct ttacccttta aggaccttca ctcagataag 1380 

actggtcccg tgctttaact caaaagaatc tcggagtgag ccaatagtga accagagccg 1440 

gctcttgtgc ttgtttccct ttgtttttta agatttcatt ttggtattgg agaaatacct 1500 

gaacagttaa gagcacaaca gaaaacccag gtcagttctc agcaccctca tccagtggct 1560 

cacaagtgtg tgtaactcca gctcctgacc tgggagaatt gagaacgcct gcctccctgt 1620 

gtggtgcaca agctcccaca tgggtggatg gtcctgcacc atctttccac atgtttacaa 1680 

tttttttatt tttttatttt tggtttttcc agacagggtt tctctgcgta gccctgattg 1740 

tcctggaact cactctgtaa accaggctgg cctcgaactc agaaatccgc ctgcctctac 1800 
ctcccaagtg ctgggattaa aggcatgcgc caccactgcc cggcatggtt acaatttttt 1860 

attttgtcgt tatagttcct tttccatttg gtaaaatcta gtcattttac cttgggacaa 1920 

aacctccccc tccccccacc ctgtttttaa aattttattc attttgtttc ttgttttaga 1980 

gtggggcttc atgtagtcca ggtttacctt atgctatgta gcc 2023 

<210> 49 

<211> 3693 

<212> DNA 

<213> Mus musculus 

<400> 49 

gctttctgga cttagggtcc agtgaggaaa gcatccaaca tctatgacaa atagtgaata 60 

agaaactgta cattaactat ggagaaagag ctgtgaggga aaggaggcct ttgacatttt 120 

agactatctc ttagcccttc tcctgcctct taattaagag gtgctttctc tttgcactgg 180 

accctggaaa tgatacagtc atccccgctt ggttccttct ggccggcaac ttctgtctgg 240 

tcttcccctc ctgcttctag aaaggaaaca agaagtcatt taaagatatt gctgagatca 300 

gggacgagtc atttcttgtc tcttttctgt ccttctgaag gtcactatga ccagctccac 360 

ctcagacctt agaagttacc cggtctgagg agaggagaca aagtcgtcta tcactcgctg 420 

gcttggaagt catgaatctt cctgccagcg agctatttct ttgcagacac ttactatgtc 480 

cccagttctg ggttagtgtt ggctctgggc aggaccaatt agggatgcag gacaggaacc 540 

ataaagcaag cttccaggaa atgaggctcc aagttcacag tcagctttat ttaagggctt 600 

gtttgcaagc acagggacag ggccgtgcca ggcaagttaa acattgtccc tggccccaag 660 
gggatggatc atggcgagtg cagacatggc ctggtgcaga gagggtgaga agacagcctg 720 

accactgata aggtaagcct atagacagtt gtcacaagca tgacactgct cactgttgct 780 

ctctaacctt ggtccagaat cacgggcatt ccatagttgg caactgggct acgtctggtt 840 

catcgcaagc tgtcatggaa ggagaatgta tgtcaggagc cactgtgccc tgccccccac 900 

cccccccgcc ccactctcag gtctttccca gggcctacct catagtttcc aaacctttca 960 

agactgcggt aatgccctct ctaccggtac attttaagcc ccgctacttt gtttgttact 1020 

ataaaagtcg catcagacat tcccagaccc caagaggcca gacaaacgcc aactgtcctc 1080 

ctgtggtcct gctagacaca tctagaataa cccagtcagt caatggacag gggagacatc 1140 

tctgtgtagc ctggcaggtg ggggaagtat cccgtgtggg acatacctgc ctgtcctcag 1200 
ttatctgaaa ttccagtgta ggcggacatc ctgtactttc tcagcagctt ctgtgaaatg 1260 

gaaaggacaa agactctaga ggattctgaa ggtccccagg aagggtcaga 1320 

agggacccta ttactgtggc tgctaactgc ctgctgtcat gtctgctggt atacccaggc 1380 

ctaggcatag ctcctctgtg cactgggatg ggtgtgccgg gccttbtcct cctctccacc 1440 

actaagctct ccccagggat gggacacagt gagggtccca tcccaacctt ggcccaggac 1500 

tcaagaggcc caccaggcca tcagaaggcc gttcttgttt cctcactgac tagcagcaaa 1560 

atgcaacact cttggctcaa acaatacaga aaatgtcccc tccatcttaa ttagtcctag 1620 

cac^ggcagc tccagtgttg gctaattcag tgttagcaga gactgttatt ttgacctgta 1680 

gcctgtgcga ctctttcttg tacccagggc atgtcgcctt atgtccttta ttggcttcag 1740 

atggctgctg cacgtctcaa tgtctgcctg aggacatggt aacaagccat caaggaagaa 1800 

gagagggtgg cttgcccaga tgttcaagcc catgatgtca ttctcaggaa ctcccatcag 1860 

atagctcctg tgtgtgtgtt tgagctgaag ttgtcacgtg cttgcatgta 1920 

aggcacagca gtggcctctg actgtcactg tttgttgctg gagtggtccc 1980 

gtgcatggag tggagacaca cacacacaca cacatacaca 2040 

ttctctaacc actccctacc ttattctata aagacttcga ggtcgccgut [illegible] 2100 

tttctcatta aaaactcaag cattacagag aaccatgact aaaaaaaatt [illegible] 2160 

aactcattct aaccatactt gggtctatga caaattacct aattttattt [illegible] 2220 

taatccgatt ttaatttagg tttttgtttt tctgtttctt tctcttttcc ttattttggg 2280 

cttttgagat gagagtttta agatgtatcc ctgcctgccc tggaactcac tatatagacc 2340 

aggctggact caaacttgtg gccaccatct gagtattggg atttcaggcc tgcaccacca 2400 

ggggcagcta acctcctact tcatcatttt aatgccactt cacacgtttc cctttgccat 2460 

tttaatttaa ttatactagc aatcacatgc ggctttggtc aggttcttag gtgacagccg 2520 

cagctctgtc tcccatggac agccttatag aagaacctcg cggtaaccca tttcacagga 2580 

gaggctacac attcctcatg cttaacaaca tccagtgagc aatggctggt ggaccacaaa 2640 

ctgatgtgtg catgggcatg tgcatgtgtg catgtgtgtg tgctcgtgag cacacaggct 2700 

tgcatgcatg tgcatgtatt gtgcatgtgt gcacatgaag gtaggcatgg catgccttag 2760 

tgtgcatgtg gaggtcagaa cggaggacag acttggggag ccagttctca ccttctacct 2820 

tgcttggcat ggggtctttg gccatttctg ttgctaagct ctgtacccca ggctggctga 2880 

cttgtgagct tctgggtgat tctcttgtcc acaactccca tcttgctgta gaagtgctgg 2940 

gattgcagat gcaggctgtt gcagctgggc tttttctgtg agttctgggg aatctcaggc 3000 

tgccgtgctt atttgtaaaa tgctttaccc gctgagccac ctccctgttt cgtgtctcag 3060 

ctcggacagc tgacatcatc cttcttttcc tatggatagc tgccaccatc catggtggca 3120 

gaggtcgcca tatatatata catatacata tgtgtgtgtg tgtgtgtgtg tgtgtgtgtg 3180 

tgtgtgtgtg tgtgtaggaa catattcata aaatgtatcc gttttcttct cactttatta 3240 

catcattggg tgatattccc ctgaagataa catggtgtgt cagttacggt ttctatggct 3300 

gcaagaatac accaagacca aaaagcaagt tggagaagaa agggtttctt tggcttacac 3360 

ttccatgtca ttgtcatcac cacaggaagt caggacagga actcaagaaa cacaggaacc 3420 

tggagacagg agcagatgca gaggccatgg aggggtgctg cctactggct tactccccct 3480 

ggcttgctca gcctcctttc ttacagaacc cgagaccacc atgctttctt acagaacccg 3540 

agaccaccag tgcagggatg gcaccaccca caaggggctg tgccctctcc tgttgatcgc 3600 

taattgagaa aatgctttat agctggatat catagaggta tttcctctgg agctccttcc 3660 

actctgatag ctgtagattg tgtcaggttg acc 3693 

<210> 50 

<211> 4226 

<212> DNA 

<213> Mus musculus 

<400> 50 

tagtttgatc cacatgtaag acgagaaatg gacaggggag tataggatgg ggaggaaaat 60 

aaagggaatt aatagaagat tggttggctg agagtgaaga gggagtcttc aaaatagtga 120 

ctagagtcaa ccaaccaact aaccaaccaa acaaacaaaa aactcttgag aaaatagcaa 180 

gttactgtca ttattagagt agacgctctc tctctccctc tatctctttc aaacacacac 240 

acacacacac acacacacag agagagagag agagagagag agactagagt gagagagaga 300 

gagagagaga gagagagatc tatatacccc cagtcagtct ctcaaaactg ttctcaagga 360 

aacaggtagg tgacatttct aattatccct tccatgtcct ttcgaagcat catcttccat 420 

tcataccttt tgtagctaat gactctaatc aatgggggct ggaatctctg ataagccaac 480 
cctatgaaaa ggattcacta gagagaattt agttgtgctt ccctataggc tcaaatgtga 540 

tccccaatta tggattgcta cttctcaata accttcagga gcttcacact atagttagtt 600 

ttagggcttt cagatttaag tcacaagagg tacacacgtt cttttcagtg ttatccaggt 660 

gataatgcaa aggaagaggg gcaagcaatt aagtaggaat ttagagagtc agagggagag 720 

tttcaccacc agtgcctctc ttcagaatcc tacattccta gtcatagagc atccttttgt 780 

ttttcttaat ttccatacaa atatcagtaa gcctatttac cttgctctgc ccttggctta 840 

caggctttag gaatctcagg tcaactggag gttatttctt gaatgaacct tcagatttca 900 

ctctggggac caactccctt cctgttgaca cagctaagta cactgaacaa caaaattgaa 960 

atgtgctcta gactccatgg aaccagcgag gttaaatctt tgccaaggtt ctagtgcctc 1020 

ttggtgtact tctggagatt ggagccccaa gcaaagtttt cttggaggac agaatcaaag 1080 

aagataaagg aactaaagtc aaaacagcct ggaaatgaga atgagagatt tcccagcagg 1140 

cactgtctgt aagagccttg atatcattgc attaaagctt cttttccctg tagacaatct 1200 

ctccaccatc atttattgag ggtcaatgct tgtcttctat gtcttcagga gcttagcaga 1260 

gtctctgctc caatgtatat tcttggtctc tctgtctctc tgtctctgtc tctctgtctc 1320 

tctgtctctc tgtctctctg tctctctctc tctctctctc tctctctctc tctctctctc 1380 

tgtgtattaa tcaattgtta ggttgttcca actgtacttt aattccaatg gttgctaaat 1440 

aataatgttt cttttatttt tcatatgtga ctaacatacc agtttttctt tctttttaag 1500 

ctaaaagtgc catggaattc caccaacaaa aggggcttaa gaatttgaaa acaccatttt 1560 

tcagatatag cttaacagca tattaatatc atatgtataa gctgcttaac ctctctgtgc 1620 

ctcactttca tataaattag aattataaat actcactctt aggatgaaga aaaaagggaa 1680 

tggggatgcc ccacttcttc tttttcactc cccctaacac ccaatatgca tgaatgtgac 1740 

ctctcagagt tacttatgaa ctctgctatg tatggagcta aggccgtggc tatgacctct 1800 

cggaacagca ggtaaagtca gggctcatgt gctcccaatg agtccacatg gtatgcagct 1860 

gctttgtctg tctctggcag tcagtatcca tcacagggtt acagttactc agcgcttcag 1920 

acaggtatct tgtggcttac atgaagaagt ggctcgttcg ctaagaatgt attaattgtt 1980 

ctctctgtgg tcagaactgg gagcgtggca gttcattgaa aggggtacaa tctttgttca 2040 

agtagaacat gagaagagag agaaagagag agagggagag agagagaggg agagagaggg 2100 
gtgagagaaa gagagagtaa gagattaatt acttcttgac agaatcaaga aggataggaa 2160 

gagcttgcct tagaatgtga atgtaattag gaagataaag gatgaaagaa taactgggaa 2220 

gggggcaaca ttactagaat atcttttaca gacaggaggg tgaagcatca tggtatactc 2280 

tgatgagcct gtggcatgaa cgaagctgag atagcagggg cagagaatgc agtgactaag 2340 

aaaggacttg tccatccttg gagcttaaat gttattttcg gtgcagcata acactgagaa 2400 

ttgatggcat cacctgaaga cggatctgga ctcttagaaa tagctcccac ccactttctt 2460 

gagaccatcc tacacccatg acaagatcta tgtcatggtg aattacagct ttttctttgt 2520 

gtcaaccaaa ctaggttatc catgaatttt cagtgtcctt gtcccaattt cctgaatcac 2580 

aaaagagcac aggaagaatg cccctcccca tgccaaccga atccccctgt ctatggatca 2640 

gcacagcatg tcttaaagcc tgagtgagct taaagagtaa ttgctattgt ttaactttac 2700 

caaggctaat ctatcacacc tcacccagag aacccctgat ccactctttg agcattctct 2760 

gtcctgaggt aaaacataac aagcaaataa ataagaaggg aactgtgtgg aaaccctctt 2820 

tgtcactact gaagcatgtt catttattta gcaaaatgtc cataattttt aaattgcttg 2880 

aatcagccac tggctatttg gtatcttcaa ggggtcccca tcaggaataa tacttcccca 2940 

ttacctgcaa aaaaaaatat tgaggcaggc ttttgatcac aggaattaat tacatgcaca 3000 

gaatttcatt gctgggagca gcaagcagct ggttcctgca gggccctggt tgaactctct 3060 

tgccaactcc ccttctatgc ttgatcctcc ctgcacacct acacccttgc tttctttcat 3120 

tatgctccac aggttctatt caatggggga aaattgtaat taaaacattt acaaagcttt 3180 

ccttatgacc gcccttaagg ctgcgaacct tcacaattca atcttttttt ttttttccaa 3240 

ataaggcaca atgacagagt ttccaggaat ttcttcctcg gggactcagg cctcctagaa 3300 

tgatattaat acattaaaaa aaaaaaaaaa acttcacaat gaagctctgg gataaaagga 3360 

gagcacgtat cttcttcaag ggaggggaga atattgtaat gatgactaat tattctcagg 3420 

agccaacagc ttccctggtt gtcagtggga tcagttaaca atggcttagc ttgtctatct 3480 

tccttatttt cctgttaatt attcctacct ctgctaccaa gagaaggggc ttgttttcct 3540 

cttagatgta attagagtaa tgaaagggtt tataatattt atatatttta tttctaggac 3600 

tctattcaat tttactttca tgcaggacag aatatagaat gcaaaacaga aacctcaagt 3660 

tcctaggttt tgtgaagtct tacagagaaa attggtttca ccataatagc aattaggagg 3720 

gataattcct agatgaacaa cctgaaatta ctcccttaaa ggcaatactt tatataggat 3780 

tttgagaaag gtggggaaga tgaggtgaca attttggtgc attttatttg tgttgatctt 3840 

tgtattttgt gataaagaag gaaggcaaac cattggtgta ttcttctata gacctgcatt 3900 

tgcatatggt ttgtctctgt tgaatagatt ttggtttgga tgacaaatta aatccccagc 3960 

ttccaaacac ccaagttctt ttgttcagaa tttataagcc aggatgccca aatacagctc 4020 

ttcttcaaag gtaaaggggt taagcaaaca gttgctagac aattattctc ctttttatac 4080 

taacaaaacc accttctagc agctcagaac acatagcaaa tagcatttaa aaggtattat 4140 

gccccatcat cacaggcatt tccatggcaa tgaagtgatg tggcacacac aaacaaggat 4200 

gtaccagttt ttttttcaac atgctg 4226 



<210> 51 

<211> 1560 

<212> DNA 

<213> Mus musculus 

<400> 51 

gcctacactt ctgcactatc tgtgatcagg acgacagtcc aaattcaact atatattaac 60 
ttcaattact tgaggtgttt aaaagataaa agtgtactca aggttttcag catctgaaaa 120 
tatatagaga aaaaaattat agcaaactaa tacttcatgt ggaagtatta aatttaaaat 180 
ttaaattatg tacccacaca ccacagttat atctttaaca gtactaacac cagatatgca 240 

gcctaaagta cttctcacag acttgaactc catctacaac taacattaag aaataaaaac 300 
aaaaaccatc ttcataaacc actgatcata aatttctatt ttttgttctc taacttgata 360 

ctatatttaa ttaactgact ccttttgttt aggtatgctt acacctaaaa gatggagatt 420 

gttttgatac taagataaaa ctttgagaat ccttaccaaa ttttaccatt aaaaccctta 480 

gtataaaaga ttcctatgat caaagtctaa tagttctttt taagttgtat ttttaaaata 540 

ttaatgatta gatgctccac ctgctgaaga agaatatgac ctggaattcc aaaattgaca 600 

catggttcat aaacgaaagt tacttagaga aaatagcctg aaaaatgaaa aagaacagcg 660 

aatccttgtt ctaaaggaaa gagaaaacct gcatgtgtaa ataagttcca gtggaaggtt 720 

tagaatctgt cctgtgcccc atgctgttta ttaattactg cagtttaaaa caacaacaat 780 

aacaacaagg aggatgtgtg aactgcattg ctccttcatt caggtcagct tggctttctt 840 

cttgccaggg cccttatgcc ctttggcctt tcttctcaga ttgcgcctgt ctactacagg 900 
ctgttccagg tgctcagagg 
attctgtgga aatggcgcgt 
tggttcctgg gcctcaggtg 
gggtgataaa ttctgagaga 
attaccatca gacttttcta 
tgttttcttt tttgaaactg 
gctttgatcg atcaacattt 
aaacaggtca gtgaggcatt 
gtgaagacct ctggactcct 
caatttcttc aaaattgtct 

ctgagctgct atccacagct ttctctgctg actgtgtaaa 960 

acgcttcaga ctgttcctct gactggctct tctgtcctgc 1020 

cttcaagttc tacctctatc agagaactct gagtagccga 1080 

ctccctcctc accctggtct tccactgagg aagtagtgtt 1140 

ctggttcttc gggaaattct ggtaccacag aattgggaaa 1200 

tatttgatac agtggcaaca ttgtctgagc tccccaaaag 1260 

gggacagatt ttcttcagta tgtcctttat catcagatgg 1320 

ctgagtcagt atcaggtgag gtctgggaaa agcacacaag 1380 

ttacaatgca gttctaaagc gattcattct caatgtcact 1440 

catttttgtt cagcatgaag acattatttt aaatgcaact 1500 

catgttgact gtaatacaat aaaagaaatc tttatgcccg 1560 

<210> 52 

<211> 2849 

<212> DNA 

<213> Mus musculus 

<400> 52 

gccagcctca cgggcctgct gccactgtct gccctgtcct attgcactgc atctccgaac 60 

ctgagccaaa gaaatactta agaatttggt catagcaatg agaaaagtag taactaattc 120 

aataaccctg gcttgaactg tttcttttac ttcgcaccaa tgcatttatt cttatataaa 180 

gtcttatata aaatctcgtg tgttctctct ctctctctct ctctctctct ctctctctct 240 

ctctctctct ctctctctct ctctctctct ggttttgctt catccattca tttgtcctag 300 

agttcctaag ggaatataga tctttccctc tccagatctg tcatagcact ttcttcacat 360 

ctgtacttcg tgtcttctgt gtgccgtgtt ctgcacaaag gcccagccaa taagaacacc 420 

catctccctt gcctgccgtt ctgtttggag ggagaacttt ccagtgcacc cagttactgt 480 

gctcatctcc ttcatcactt ccttttcttg attttcatat gttagagaac aacaaattag 540 

cccttcatgg tattttaagg gaaaacatgt aaaagactca caaatagatg aaaatactct 600 

cgttaaattc aatggaagat tgaacatcat atgatctggt tctatgcagc cttttcttgt 660 

ccttgagtta ttttacattg ctgtaaatta cattctacta gtaacaatgc aaatgactca 720 

tccatacaaa tgcaaattat acctaatttg agattcgact gattattgca tgtgcctttt 780 

catgtgagca tgtgcatgtt gaggccatgg gacagtctct agtcttacag ctcagtagct 840 

gtctgtctta tttcttgaga cagtgtctct ttctggcttg gaactcagca agtagggcta 900 

aactggtagc caaataatcc cagagatcag cctaactccg cctctccagg gcttggacta 960 

aaatgtgttt cacagtggtt ggtatttaaa gaatgatttc taatgtgggt tctgggaatt 1020 

taccctatag tcacacattt actattggcc cttggattac cacagtttct ttctttctac 1080 

ccaatgctcc cccttttgcc tcttgcttct actaccagta tggggaacac aaactatctg 1140 

tgtcgcctcc aaactgctca tgtgtctggt ggcataaatt tgtttggaca atgtgggaat 1200 

aaagtagaca acaacaacaa caaaaaaaag gaataaaagt ggacaggttg aagacaaagt 1260 

ttacaaggaa acctgacaaa gttggcctat gggttgcatt tggatcaaga atcatgagaa 1320 

tcatgataac ttctgtgtct ggcgtgaaag actgggtctc cagggaacat ggatgtagga 1380 

agaagtgttg tgtgatgagc cgggggaaaa gtccttaaaa tattgttttt aaaaataatt 1440 

gattacatac aggcatatag aaagttaaaa gcgaacatcc ccaggacatg gtaaagattc 1500 

acgtataaaa ttctgcctca agttaccacc cttttcccac ccaggacaga agttcacaaa 1560 

ggacatccac tctgttccca gcttgcagcc attctttgag gcacaattgc agtttttccc 1620 

actagtgagt ctgtgttctt ttgtttatga actggaaaga aagacatgcc ctgaatctcc 1680 

ctcatagact aagaaaaatg ctttctatgg agtatttggt ttccaatggc tggaaattta 1740 

caggtcactg cttcttaaaa ggcacactgt cccttggaga aggtgaatat gactatcccc 1800 

ccattgtctc tatggtagaa aacatgtaac cattgattgg gtcagagtgt ggcaactgaa 1860 

atctggaaaa ggaaatttta aatgactatc ctgacttcac agcctcttgg ccaaatggca 1920 

gtagcattat cattgagcag gttaaggatg ggcaatggtg gttaattgag ttatagtcct 1980 

tgtagatgaa atgtccaaag tgcgcgaggc ctcaggaaag tctaatgctg ttagatgcat 2040 

cctgtataat agaatacctg caaatgtctg taaacaaatt aggattgctt tttgagtggg 2100 

aatgtatagc tctgcaggga catgcgaaat gtatctggta atgtctgtaa agcacttgaa 2160 

atgttttttt ttcttttagc cacttttcct cagggatact ttgtctcccg tacttctaat 2220 

tccaatgata aagaataatt gtcaattcta cacttagtgg gttgtgcaga gtgactcacg 2280 

aagctattta caacacatcc agaactcacg attacatttc tcatcatgaa gatctgcctt 2340 

tctaacttgt gacatttaga taccactttc agcttgcagg agtggcagct ataacttcgt 2400 

ggtccttagg gaaaaacttt actaccggag tagaagaatg tagtaaaaag gagcgtaggt 2460 

ttggaaagaa catttggtac ctgacacctg gctctcacat ttatgagcta cacaatctgg 2520 

caaatcactt tacctcctgg actccagctc tcttacctgt aaaatgaaga caatggcacc 2580 

tgccactcaa ggctgtttgt gtgtctcggt gagtaatgta tagttcttga cagtttaatg 2640 

atgcccaatg aaacctcctt tttccttcta aattcattgc aattagtcaa ctcaagagca 2700 

attattctta aatcttccct ctgtctatgc aggtcaattc agaggttaac tgatataatt 2760 

ttttttaaaa tagccaagtc caacatacat acatgcactt agattgttaa atagttcttg 2820 

actgaacctt aaaaaaacaa aacaaaatg 2849 


<210> 53 

<211> 3551 

<212> DNA 

<213> Mus musculus 




<400> 53 

tattcactga agaaacaatt gactttcgct tctctgttgt caaagtggcc cctggtgata 60 

gtgctgcagt ggctagatag agcgagcaga atggcttcct cctggctggt gggctggcag 120 

cttcactggc actgccaaat gtacttccta tttgttgtgc aagggaattg gaacagcgag 180 

gcatttatca tatcatcccc tactcctcaa tgcgagcaaa aaaggagaag ttgtcaatga 240 

aagaaaaaga actgtaatcg cacatttaca tatgcttcta attgttgatt tggggatttt 300 

ctatgaatat agctttacaa aacagatgct gtttaagaaa agggggaaca taattttgtg 360 

ggcaatgaat taagtgtttt tgtggccctc tcatccgtag ctaggagcag tttgtggacc 420 

gcgtctgtga acgcggctca taattgtttt tcacacataa gttatgcaaa tgagctttta 480 

tggcaactgg cataacaatt agcatcctcc agcaatattt tagcaggtta attgcaaaat 540 

ttctaaattg tacatctgac ttgttaatta ggcatgacag aggtggtaaa atagttatct 600 

tcaggcagtg gcagccagga gctgcttgaa atgcaaagag caaggattga ttggatttga 660 

gggctgcaat tgtgggagca gggctgctgt caagtgccgc ctagcagctc tgctccagcc 720 

gctgcctcag agcaagacca ggacttgctg caaggatcct gccacttaca agcctgcttt 780 

atttaactca acaacagtcc cattcccacc tatctgaact gtttatgttg tacagtttgc 840 

tggccatcgg gatcattgaa atgaggtagc aacacaaaag aagttctttt gggcttgagt 900 

tttaggaccc tggaagttta ctttctattt taaagtctgt gtcagcactt gccacttaaa 960 

aaaaaaacaa tgtttagata ggtaaacaca attcccaatt tttattgaat gtaaatttta 1020 

gttatccagg catcaagtgt gatcattttc tttgtgataa tttaactttt caacatcatt 1080 

tcctttcatg attggcatgc ttggggaata ctattttgtt atttatttat ttttaacaaa 1140 

gtaagagaga aagagagtga atgtcagaac tacgtaaagt acattggttt actttgggaa 1200 

atctataaaa tgataatatt gaacatagca tattattctt agattttata ttagtaaaga 1260 

attcttctgt agtatggttt aagtaatttt taaatgactt aaaaaatttc tatggacggt 1320 

ttcatagtct gaatggtata atttgctaag cataaacatt agtgaataga gcatgatgac 1380 

acgtgtgtat ctgcttgtgt cacgatgaac catgctcttt gtatttaaat tagctaatta 1440 

catttttctg tgcatatagt acacacaagt acaccacata tgaagtgaag catgtatgaa 1500 

gtctgatgtc atctttgaat tcacatatat tttcttgact aatccagatt atgtttaatg 1560 

ttgaaatgtt atattttcat ttaaaaagta aatgttctgc atgtgcagct gtgctaatat 1620 

tttacatttt cttgtgacat gtgtatatgg agaaagtgcc atttatgata tgttgtcatc 1680 

aataattttg tcattgagaa taaaaggctt aaattcatcc acccagtgga acattttgtc 1740 

atatttattt taagagatga aattagtgaa cagttgggct ctttatatgt aggagatgtg 1800 

aaaatagcag aacatttacc atgaatggag tgcaacatta gcatctcagt gccactaaat 1860 

tatacagtag tagtggtagt gtattctata gacatctaat aaatagtttc ttgttttccc 1920 

actttcttta tatgtgttct ttatatggcc tagtttcata ataaaggtga tattaattat 1980 

tggttaactt ttttagagtg atctatcaca tttgaaacta ttgtatttcg aatgagaata 2040 

aaacgttgtt ccaaatcatt acatttactg attaaatgct caccgatttt tatccattgg 2100 

tgtatttggc aatttaagta agagtttacc acagtaatgc ttcagtggat aatatttgga 2160 

atagagtaac cattttaatg agttcatcga gggagattca gcagggagca tttcaggtgt 2220 

attacggctt ttgtcttgtc aggatgcaca atctccatac cattagagaa aaggcttcag 2280 

agtcccactc atctccgtta ataatgatac taacaacaac aacaacagca acaacagcag 2340 

cagcagcagc agcagcagca gcagcagcag cagcagcaaa gacaaaataa taatctagag 2400 

cttctccttt ccagtaaagt ctgggcagca agatagaaag cacaggcagg tcgagtgttt 2460 

ttaggaaact tgttaagcga ataccatttc tgtgggttaa atttccatca catttttaac 2520 



tgtcctaata ttgatcaccc tacagaggaa tgagactgaa gcctggtagt tattagtata 2580 

aagagggtca gctgctgaga cggtccagag cagaggctct catggtaatg atgctgctat 2640 

ttatagatcc ccatttcatt agttgcagag tttcaaggaa gagattttct ctaggggaaa 2700 

tggatacttg aagttcattt tcttcctcac attaaggcag gaacgtgaac aaccttcagt 2760 

ataggatgtg cgattatggt atttttgcag gggcagttta ttccctacat gtatttgcca 2820 

cagtaaatgt acatttaaaa cataatgtag ggactcagaa atgccagctg ctgttttggc 2880 

cgaatagtac attatgtacg ttgctcttga tatccttgtc attttttttt cttgtaaaaa 2940 

ttaaatttca aaaattgtcc aaagctgagt ataatcatgg tcttctcttt cttccgagtg 3000 

ctttagagcc taagaaggat tgtgagaagt gccagtcccc ccaggtccag tctgtctaca 3060 

gtgtgttatc tgtctatatt tgtgatatag gtaattgtgc tttctttctg gaattcttga 3120 

catttgagtt atttttttcc ctttaagaga atatttactt agctagtatt cacttaatta 3180 

gaactgactg tttaatgttt tctgggcggg tatttatggt attttctttg ctatatttgc 3240 

attccagaaa ttaagtcccc ctgccattat tcggcaagcc tttcatacat tagaatgatg 3300 

aattgaaagc agaaatggga aaaagactgc aatgcaatga aaatttaatc agcgtcttct 3360 

gctgctttaa taaggcaaat aattcttatt ggccgctgtg ttaaggtttc taatatttaa 3420 

ttcataacaa accttgcatt attctgcagt tgcatcgaca gctccacttt gctgcctgcc 3480 

aacaggcaac cataaaaact taaaagcaga tgtaaatgtc taaaacaagg agaatgatta 3540 

gatctaaagc g 3551 

<210> 54 

<211> 2244 

<212> DNA 

<213> Mus musculus 

<400> 54 

gcaatggagg tggtgttcag acaaagagag tgggtttcat gttcaaggaa gacattctat 60 

aagaagtgat ctcagaccat gggctagaga agtggacaga gaattgaacc aggtagtcct 120 

tcagagcgag gctaagagct ttagagtcat ccctagcaca gggaatctgc tgagagagtc 180 

actagatatc tgcagcctgt tctaaggtca agattcttgt caacttcttc tctgaggtgt 240 

ttgctgttag aagctctgct tctagaagct cggttcctag agcagagatg gtcataggtg 300 
gcgaactcca aagaggtgat ggctagaagg gacaaaggtt ggtctcatga gactaaagtc 360 

ctagttttga tgtgatctac ctcttttatg tatgtgcttt ccactgtaat tccatgcact 420 

agagctaaac tgatgatgaa accatgcttt tgaacatcta gaaatgtgat ccaaagaaac 480 

tagagggaca gactgtgttc caatggtcag atgcaactct ctcactaact gataggaggc 540 

actgttcttg ttatgttggg ttagtgttta ctgtaagtaa tgttcctcta gcaaacgcta 600 

aactactttt aattttaata agtcaaagat caggcaattt aatatgtata tagaatttgt 660 

aaattaattg tataaaataa tttacatata cattttataa cattatagca catacattta 720 

tataaacaat atagggtaag catttaagct tatttgttga ggtatccaag cctcaacaaa 780 
gtgtggggta agaaacacca atagagatgg ctcagcactt tgtcctgtgc ctctctctgt 840 
ggctcactct tgtctcctgt cacacactct cttcctgaag atggcagccc aagttcacag 900 
ggcgcatgga gccctgtgct tgctatagag ccaagaatga ccatgaagtc ctgaccctcc 960 

tgccttcaac ttccaggatt ataggtacac accacaacgt ctggcttctg aagttctgga 1020 

aatcaaatcc agggctctgt gcatgctagg caagcactct ggcaaataag ctttgtctct 1080 

ctctgaagag agactctctt ttttctttca gtacatttgt aattgaaaat agacaccact 1140 

ctcctaactt ccttcttcca accccttcta tgcaacccca ttctccagcc agttggtagc 1200 

ctcttttcat tactattatt ttctatctat ctatctatct atctatctat ctatctatct 1260 

atctatctaa tctacatatc atctatctat tatctatcta atatatctat ctatttaatc 1320 

tataatctat catctatcat tctaattatc atctagctat ctaatctatc tatcatttat 1380 

ctttgtttct atctatcatt tatctttgtt tctatctgtc tatcaatcat ttatctatgt 1440 

atgtatgtat ctatccatcc atctatctaa tctattatct atctatctat ctatctatct 1500 

atctatctat ctatctatct atctatctat ctatctacct acctatctat ctatctatct 1560 

atctatctat ctatctatct atctaatcta ttatctatgt atcttcatgt atacaaagat 1620 

atataaatac agtgtgatga gtgagctcat tttcttgttt gtatgtatat cctttcaggg 1680 

atgaccactt tgcagtggac aatgataagg tagcttatct ttgagaggtc aactctactc 1740 

ccagcagtca tttgttgtct acagttcttt gactaggggc agggctgtcc aaaatttgaa 1800 

tggtagatga tatttacata gatgtggtaa cttcatccaa ttgtatacaa atgttgtatt 1860 

ctgattaaat ggaagaatta tcaaatagat gaaaaccccc atcttttaaa taaagtcctt 1920 
ccatccaatg aagtaaatca tgaagcaata aatgtagaaa catgagataa atctaaaaaa 1980 

tatatggctt caacctgagc ccagtgcatc ttgacccagc ctgcccactg gcatccccta 2040 

gggaggtcat gtgacctaca catgctgtgc taatctcaca tctgctgtaa cagtaaaacc 2100 

tctctccacc cttcaatccg cctcctctgt gattcccata ttctcctgtc cacactgttc 2160 

cacccttctg ctgcaccacc caatggccca aaggccttat ttaattcagc taagcatttc 2220 

ccatttttca agaacgagct tgct 2244 

<210> 55 

<211> 1511 

<212> DNA 

<213> Mus musculus 

<400> 55 

gggtcactgc agtgtcttct cccttcaaaa gatccaactg atgctgaggt agaaattgaa 60 

tgtcggcgcc ccctaatcct cctctgcgga ctcttgggag gatgtctctt cagtctcctg 120 

agcaggcttc ttctgaacga cttgttgtgg ccatttccta ggtctgcctc ttggccgttt 180 

tttctccaat ggtctctgct ttcttctggg ctgctttaga gggactcttg tttttgctgc 240 

ctttgggtct tcctctgggt ctcttaggag agggctcaca ggagcttagg aggttggctc 300 

ttgctgctgc ttcctgggtc ggccgcgtcc tcgcttctgt ggcacctggg cggcaggttg 360 

cccctgggct gatgtggaag gctgcccggc gccctcaccg cgtgcgctca tcctgcctcc 420 

cgccgccgct accactgcct ccgccggtac cgccactgca gccgctcggc ctccaccgcc 480 

ccggatccgc ccagcacctt tcggtagacg ggatggagag agctggagag ggcaagagca 540 

gcgagagcag gcgagctggc gtgtgcgcct gggactgctg ctgcttaggc tgccgccgct 600 

gccaccatcc tgcatcactg ttaagggaag tggaaacttg agggttcttt ggaaagtcgg 660 

tgggatggtg ttttgctggg gcaaacacgt gaaggaatgt tgccgttgcg ttaaagtgga 720 

cacaggtgtg aaaggctaag gcagactcct gaagaaacgt tttgttgaag ctgacacagg 780 

agagaggatg ttctgctaaa gcaagaaagg atacctgatg aaggattctt cgataacaac 840 

atgcatgtac tggtcagctt tacattgtgt agttgagctg tattttgcca ggacgccata 900 

gagagaaatg caccaaaaaa cttctggtgg tgtgctgcag cttcttgatg cttccaagga 960 

ctcgggctga ttggcagagt gatgccagct gagacaggcg attgtgctga ggcaaggcat 1020 

gtggaggaca cgtgatctat ggagggacta aatagaactt gacggacagt ggcagaggct 1080 
gagcttggct tgtttagaga gctagctgtg caacagtttg tcagtctcgc agcattgctg 1140 

atcactgctg agaggagagg cacaaccagg aacttctcag ggcattcctc tgggtccttc 1200 

ctgctgactc aagctgaagc tatggccagg ttatctctgc taggtcatgt ctccactgtt 1260 

gattcgtgtt tgctatccca actctactga actgcactgc tgtgtatctg tgaaaagttt 1320 

gttgccgcca tctgctaacc tgtaaactga actgcagatt tccagacaac acagacagga 1380 

gttgctccaa agaagttttc taaacaggtc cacttccccc atatcctttc tttcccacct 1440 

tctgttgggt ggtgggctaa aagagggtta actcatttaa gaatttgaaa aattaaagtt 1500 

acaattaaca g 1511 

<210> 56 

<211> 1219 

<212> DNA 

<213> Mus musculus 

<400> 56 

gccgaccccc gagaccgaag attggggaaa aaaaaagcaa aacaaattta aaaaaaaaat 60 
aaaaaaagaa agaaaggaaa agaaaagaaa ccacgtcaaa agttagcagt gaaacccgcc 120 
ctccgtttct tgtacagtct ggaggatttt cgctacattt tgacaactct gaaacgtgtt 180 
aactcttagt gccatcaaga atcccatttg ggagtatttt tgatttttct actttttgtt 240 
gacaaaaggg atttgtactc tgtgcattgg atggacctgt ttggtacttg ggattttcct 300 
ctttgagtca acatcagtgt tgtaaattcg ataaacggat tcacttttag cagcagactt 360 
tgaactgcag cttcgcccgg ctgatgctgg gcgggcgccg aggacacccg actgagctca 420 
cgtacctgcg ctgcagacca agccttcgcc cgagttcaag actccagtgg actacctttt 480 
tgcacagcgc tgcatgttga taccactgcc tttactcact ttggtttgtt tggtttttgt 540 
ttttcgcttc cagttgggat gggggaaggc ctttgtgtgt gtattggggg gaggggttaa 600 
aaataattat cccaaatttt ttaatgtatt gctttttttt ttttctttta ctgttttttt 660 
tttctttctt tctttttttt ttttggcttc tttaccattt taagttctga cctcaggcct 720 
ccatttgggc ccagggcctc ttggaggctt cgcgttttct gtaccttgtg atgaatgtta 780 
ataggcgttt ttattataca gagctgaatg tcatttctcg tctgtagttt tctggcactc 840 
attccatttt cctatagaca tcaccatgtt tctctcaagt ttaaaaccaa accaaaccaa 900 
acattccgtt ttggtctttt tccaaaaaaa aaaaaagaga aaagaaaaga aaaaaaaaga 960 
tcccaaaggc tgcaccttac acctgaaggt ccccacacgg ggacatgaca tcctgccagt 1020 
aagagaatga atgacagaaa aggaaagaga gaaacgcacg cgcgcacact ctagggacat 1080 
ggcagattca tttcacaaaa acagtattgg gggtgggatg ggggactgtt ggaggttttt 1140 
cttttttttc tttttcatag cacccaccat tgtgagtaac tgccaccaca ttctctcagc 1200 
atcaggaaac acagccacg 1219 



<210> 57 

<211> 4491 

<212> DNA 

<213> Mus musculus 

<220> 

<221> modified_base 
<222> (3755)..(3755) 
<223> a, c, t, g, unknown or other 

<400> 57 



agccatctgc gaagttcctg gaggagagga agaggtgaag agttaggagt tgggcactgg 60 

ggagttgctg tagtcagtag agcacttgct atgcaaaata aggatctgag ctcggatccc 120 

taaagcccat gggaaagccg agaatgtctg tgtgcatctg taacctcaga cctgggaggc 180 

agagacaggc agatgcctga gttcgctggc cagtcagcct ggctgaatca atgagttcta 240 

ggttcactga gagaccctgt ctcagaataa agtaaaggtg tgagtgatag aggaagacat 300 

ccgatgtcaa cctctggctt tgacaggagt gcacactcac actcctcatg cacaagttca 360 

tgcacatgtg aacacgtagc acacatacac accagcatgc acaagtacac acacatatgc 420 

acacacctgc aaggacaccc aggatcacaa gtacatacac acaaacagac acacacatgc 480 

acacaacacc cctacctgca ggtacatcca cgtgtacaaa cacacacaca cacacacaca 540 

cacacacaag aactcctcaa actcagaaat ccacctgcct ctgcctccca agtgctggga 600 

ttaaaggcat gtgccaccat gcttagcaaa agttgaagaa tataaaacca taataaatag 660 

aaatgattga gaaaataaaa tcaacattga atatcatttt attaaaataa acaaatcaga 720 

agccattaaa gacaaatctt taaagataag atcacagtta ttttcctgtg gctatgttca 780 

ggatgggctt tctggagcag gtggaccaga agtgatgggc aaacttcaga tgggttgagg 840 

aaggagggga gaggggtgat tgaagtcagg agaacagggg acccacaaaa cagcaagctt 900 

gtctgtaaag cagagagagc attgggatga cagagtgagc atttggccaa attaaccatt 960 

ttctcaaacc atagttgcac aattaatttt atctcagttg acggtgactc cgtccttttt 1020 

gttgctcagg ctgtgaaacc ttagtcagtc ttggcttccc cacctttcat ctgccaggct 1080 

ctttttgttc gtggtaagct cttttgccct ctgagtcatc ccgctgccac ctagaccgtt 1140 

ggaaaatctt gtatctcctt ttagaccata tctggagcct gactgcctct cagcccctcc 1200 

actgctaatg gctggcacaa gccaccacaa cttctggtct gagttgctgc acagtcatcc 1260 

tcactggact gctaatggcc tctctgcctt caacagcagc acaccagcca gatggagcct 1320 

catcgagtgt acagcagatc atgtcaccct gttaaaaccc tgttgtgact gcttaaatga 1380 

ttgcagagta aaggtcagcg catctgctgt ggcctcgggg tgcggacacc cctgcacctg 1440 

ttcctctctg acctcatctc cagtactctc ccttacacag ctccagccac actggcctca 1500 

cgcagcaccc tcctgtttca aggtctcagc tgtgctctgc caagactgtt cttgtcccag 1560 

agagctgtgg ggtaatattt gctcacatgc cgtcttttta ttgaagcctg cctttgtctt 1620 

gacgctgaaa attgtgcctg ttcctactta ccacttctac tgcctcagcc accttctcct 1680 

actgggacga taacctctac ttcccatgga aggctggtca atgctcctca ctgctacatc 1740 

agacttcctt cctgccttac agagaattca acatcctggc ttgggactca cacaggtcca 1800 

gaaatagacc tgcttcttgt cagtgacttt catggcaaat gttgaacaca gtgtctgctg 1860 

tggtctgaat ggaagaggac atttctctag tgggattttc attagataaa ctcatgtttt 1920 

gtcacttaag gcttatacaa atttgactct ttaggatatg aaaacatcta ttaaacagaa 1980 

tttattgttt gttctatttt ataaccttat gtattgccac cctgactttg acatgataca 2040 

ttttgaaaac tgtatgttta tgaccatata tatataatat gcatattttt tcattttctg 2100 

ctaaacattt tttttattct ctaaaggatt gaaattagca aagtcaaagc aatctgccta 2160 

gaaatgtaca ttaaaatcgt aactttccca ttttgtggct aagtgcctcg aggcttggga 2220

tctgtctgaa gtcactcttt tggtggactg tgagttagtc ctggtcgtca cacctcctga 2280 

agatcaatga tctcttaccc aagatcacct tcaaaatgcg tgtggggcga atgtgcagct 2340 

agccaggtta tatttactta ttagctggca ctagctaata aagttggggc aaacaagacc 2400 

aagccaccat ctctctggct ggctcacaga tggaatgaga tttaatttga aaatatatag 2460 

aacccatttc tataaagctt tggcagttca ccctctggca ttagcattga gaattattaa 2520

ttaccaccct ggttccacgt gtgcatatgt gtctgttttg taagatctaa tgaggaagtg 2580

ttggatcaag tacatcaaat atattcttaa gaaaggccaa tggtaggttt taaaatctac 2640

taaccaaaag aaacacgagt gggacgatgt taattagatc aatgtagcca ttccataatg 2700

tctgcaacat gttgtagacc ataaatatgt attctttttt aatatttaaa atgttaatta 2760

atttttaaca aacaaacagt gtagaccttg tcaggtaaaa ctgttttgtt gggatgcaga 2820

atggtaacca tagtagccag gtcaggttgc tcagacactt gtgaggctgg ggacttcagg 2880

agagacttgg tcaccatccc ttttatctcc aggactaatg accctttttc tttttccact 2940

tacagttctc agtggttgca aaacggaagt caeigttttcc catgtttgtt gaagtgcaga 3000

cttgctttag gccagggata aattaacagt aaactcaggt tgaaaacctg gaccagcaca 3060

gaactgagaa ctcatgactt ttctaatctg ccatctgtgc taggacgggg taattacaga 3120

gattttgcaa ctgtttatac agcatttgaa attctagact tctactccca ggagatcagg 3180

gtcagactgt ttatgtctct catcagttga gtgagagtct ctttactatt ttaaatggtc 3240

cctccatcct gccgtgtttc ctgataacca accttcacca agagctcgga acaggcagta 3300

ataattagac attaacaaag gacgggtttt taagctagtc ctttatggtt tatggtacgc 3360

agattttagg agctgttttg tatttgctgg gggaaaaacc gtaagaaatg ggaatgtagg 3420

atgaacatac tcaggtttac ccaagattaa agggaaggca cgttctattt ctattttaca 3480

attcactgtc ccagagtgtt ggctgtttgg aatgcagcct atgccctgtt ggcagggatt 3540

tggaatcaat ttacagtgtg tgagatgata acacatgttc tggatcaggg gcagggaatg 3600

ctcggagctg taactctgtc tgtctcccat tagcacaaag tcttgcaaac tcgaagctgg 3660

agaagtgcct aaaaatgcta ggctcagaat gtatgtgtgc ctttacaaat gcaacacaaa 3720

acataggccc aggataaatg ccataaattg aagtnatcaa aaaagatgaa gacaaaaaaa 3780

cccacctcca tttaactgat ttcctggtat tgggtagaga tgctactttt gtgaaccaga 3840

aatacagctg tatgcacatt gacatctgtg cctagtgttt cgataacgtt caacaacagc 3900

ctggagactg catctgagac tcactcgctg gggagaaaaa actcggtgga aagcgtctgt 3960

gctaataagg cacagcagta tgtgtacaaa attcagacag tgtaagttga aacctgatac 4020

ccccgaggca cgaggtagaa gctgacctct tggctctaga agctaaacaa tcaagtaacg 4080

tgtatgtgcg tgtaagggca ggcttgggac aggcgccatt cagaagcctt tgtgaaacca 4140

aatggcaagg ctcagccaaa cgatgaaacg ctgcttcttc gaaagatgag atttacacat 4200

tgaaatgatc agaaaattaa aagttagatt tcttttttat ttaaatccta aaacctattg 4260 

tgtcctggac ttttaaatga atgccttatt gcttttactt tacctgtaca catatatgta 4320 

tatgtatgtg tgtgtgtgtg tgtgtgtgtg tgtgtatata tatatatata tatatatata 4380 

tatatataca tatacataca tacgcacaca cattctctga tgatggtctt gagtaaatgt 4440 

cagtggctat gtgattttaa atattcagac tataaaatca ttgcacaaat c 4491 

<210> 58 

<211> 2875 

<212> DNA 

<213> Mus musculus 

<400> 58 

ggcggggggt gagttttcct gtctctttat taggttgttt actaatatgt caggacgagg 60 

aaaaggcggc aagggtctgg ggaaaggtgg cgccaagcga caccgcaagg ttctccgcga 120 

caacatccag ggccattacc aagcccgcta tccggcggtt ggctcggcgt ggcggcgtga 180 

agcgcatctc gggtctcatc tacgaggaga ctcgcggtgt cctcaaggtt ttccttgaga 240 

atgtgatccg cgacgccgtc acctacaccg agcacgccaa gcgcaagacg gttacggcta 300 

tggacgtggt gtacgcgctc aagcgccagg gccgcactct gtacggcttc ggcggctaag 360 

cgagccctcc tgtgcctagg ccgttccctt ggcccggctt cccccatcca caaaggccct 420 

tttcagggcc cacaaagcat cagaaaggag ctgtggacat ttgtagttct cactagttat 480 

gagcgtctca ttactttttg tatttggtat gctttgtctt gtatgcaaat tgagctgcct 540

gggtgcactt tttcattgga ggacttgggc tggtggcccc gcctaccgcc tgggacctgc 600 

tcagaatgct tgcggagtgg tctgcgtgaa tgagtttttt taggtgtctg ttgagttgac 660 

agccccagag tgctagggtg cagcctctgc ggacgctcta aaaaggggaa aaaagctcct 720 

gtagagacag tggacctgaa tgagggcagg tgtggaggct tggaacccca caataaagtc 780 

accctgtagc ttaaacttgg ctttctcagc tgtagctgtt tctctgcctt aagcacgtgg 840 

gtttggggga agaggagcac tcggatgtct accatttagt tcaggctggt ataagtcttt 900 

cggtggtcct tctgtttcag cttccacaga ccatatgtgt ctcatttaag tcttcagagt 960 

aggttagtca cagcccttgc tgctccttca aagtacttgt attcgatgct ccatacccag 1020 

atgggtttaa ctctggatcc aggaagtcgc ttccttaggc ccctggactc aggtctgctt 1080

accggctttc tgtacacatt actcaaaacg gtagttttgt ttttcagatc tctaggtttt 1140

tacatatgat gtttacgggt cttctcactg gctgaagaaa taggagactg ttcctacagt 1200 

tcctggagca ggaacctggc atttcattaa gatatggtta tctcgtgttc aatgctttct 1260 

gtgtggcaga aacttggtgc ggtctttcag cagatttatt ttaagtgttg atagtgagta 1320 

cagctgtgga tgagacaacg aaggctaaaa ttttggagcc tagatttgaa ctctgctctg 1380 

tgcgatttta cacaacctcg ttttatgtgc cactggtgtt tttgtttgtt tgtttgtttt 1440

gttttgtttt gttttttgtt tttgcgagag tcttctctgc ccctgtgggc tgacacttca 1500 

cttctagagt ctgagcatac ataacactga taccattagt ctagtgaact cattttctta 1560 

aagcaagtga aaaatgagat gtttattttt ctgtcttgag aaatatttca gttacagtta 1620 

tatggttttt attttcttat aaagtcacat gttctatcaa actatgaaac acaaaatgta 1680 

ataataatga ttatcacccc tcttttaatc tgagttttga ttatttgaga taatcaaaca 1740 

tctataacta ttactagcaa aaagaaatga aacaaaaaaa tgaaacttaa atatcattta 1800 

ttattacaca aataacaaca gcaacaaaag tatttgtgca ttcactgtgg aaatcatttc 1860 

tctcaaaagg cttaaattag aaactgccat gtcacctgcc cggatggaaa gttacttagt 1920 

ggtaagaaaa acagaaatct tacactgaca agatgatgga cgggacagaa atcactatat 1980 

taagcaaaat tgtgtgggaa gccgccctca catttgccat tataagatgg cgctgacagc 2040 

tgtgttctaa gtggtaaaca taatctgcac acgtgcaggg gcagttttcc cgccatgtgt 2100 

tctgcctttc tcgtgatgac aactgggccg atgggctgca gccaatcagg gagtaatacg 2160 

tcctaggcgg aggataattc tccttaaaag ggacggggtt ttgccattct tgttctttct 2220 

cttgctttct tgttcttgtt ctttttcttg ctcttgttct ttttctttct cttgctttct 2280 

tgttcttttt ctctctcttg ctttcttgtt ctttctcttg ctttcttgtt cttgttctct 2340 

tgcttgctct tgctttttct ctctcttgtt cttgcttttt ctctctcttg ttcttgcttt 2400 

ttctctctct tgttcttgct ttttctctct cttgttcttg ctttttctct ctcttgttct 2460 

tgctttttct ctctcttgtt cttgcttttt ctctctcttg ttcttgcttt ttctctctct 2520 

tgttcttgct ttttctctct cttgctttct tgttcttttt ctttctctct cttgctctct 2580

tgcactctgg ctcctgaaga tgtaagcaat aaagttttgc cgcagaagat tccggtttgt 2640 

tgcgtctttc ctggccggtc gcgaacgcgt gtaagaaaat tggactctga aagattaaaa 2700

aaacaaaaag caaaagaaat gtgttttctt cgatgtagag ttgtgtgtgt gtgcatgtgt 2760

tgtgtgtgtg tgcatgtgtt gtgtgtgcct gtgttgtgtg tgcatgtgtt gtgtgtgcat 2820

gtgtatgtga gtcatgaaac tacaaagaag atgctagagg agatgaaata aatac 2875

<210> 59

<211> 3560

<212> DNA

<213> Mus musculus

<400> 59 

catttagttt ctgaggacac aggccttgtg ttagtgcaaa tttaagagtg atagtttttt 60 

tttttttttt tgctctaatc tgtctagctt ctagataatc aagacagagg ctattagatt 120 

tattcaacaa gccttaaggc acaataactg agcagatatt aatctattct aaccctctaa 180 

actaatctag ctactttcca gcccaaattc ccaatatact tttagtattt tagtattgat 240 

tttaggattt taatattgat ttggctctct ctgcctgctc catttgtgtt ctcatggtga 300 

ctcctggtcc ctcaccacct ggagaaacct ctccttcttc ttccactacc cctgccctgg 360 

cggggactgg aagtccagcc ctgtcatctc ctctgcccag tgattggctg atcagctttt 420 

tatccaccca ccagagctaa ttggggagca gtgtttacac aacgctgaga caggagattc 480 

ttagaagaag cactacaatg ccatgtcgga attgcaacaa gatatgaggg ccccgaaatc 540 
agtatttgaa tgatacacgg atagtcgtca cacagggcac aataacatta tgccaacagc 600

cttgctttta gaccactgag caaaaccagc tctccaagca tctgtggtct gaccaaagcc 660

aggcaactgt gctcatccat ctcagcatac ttgttggtgg cctagaactg gcactctgcc 720

tcatcatgat gcaaatggta gtttcaggga ggtgtgcttt ccacagagtc atttgcttat 780

acaacctagg aggtgatgag gagatggaag ggtctggtct ggggtgggtg ggctgtgtga 840

ctctgaaagg ccagtgtgtg ttcggaggtg gacagcacac ctggagctgc actgctggaa 900

ggctcagaca tctagctgga ggagatgatt ccaattgcac tcgccacttt aggtctgtgt 960

ccagaaatgt aagaagcgat ggaaagtgct agggttcctt tgtgtcccaa gctctctggt 1020

tctaaccctg actgacttgt agaccatgtg aatgaggctt tgaaggtgtc cttagagatg 1080

ttttcattcc cagaagcttg gtggcttgct ttgagttgct tgtcagggac agtgcagagt 1140

gacagtggat tacataaaac cagagtttat ccatctgtca ggtaggagtc cacagtggca 1200

gctgtgggaa gtcaaggact cgggcacttc ctgtgacatt ctgccaccta agcataggac 1260

tttgatgtta ttgtcagtat ggctgttgat ttcctgaacc atccctgcgg gagtcatact 1320

ctgggctctt tccagtagag gttcctgctt aatcctgttg tccagaaaat agtcacatgg 1380

gcacaaggta ggtggcgaaa tgtagcgctc accccagcaa gcaatgaacg gtaggtcagt 1440

gactgtgggc aaagtaaaag agatcagatg cggggtgaac acttacttct gcctcactag 1500

gatcttcagg acattcaaac cttactggcc cttaatgtgg aggtaatggt tgcatttggc 1560

ccagacagtc ccgaagcagc ttcctgatag ccgcaggaat tgaggccaag tgtcactacc 1620

atctcagcat taagactcag atacattttt agagtcccct ctctaaattg cttcatatac 1680

tgcctaggaa acacttaggg tgtctgtgaa ggtgcatatt cataagtgct ctggcaatta 1740

acactccacc caccccctct ttcttctcct cccccctccc ccttgtcttc ctcctctccc 1800

ttcccttccc ctccctccct ctctctcatc ttccccccac tccctcctct tctccatctc 1860

ctcatgcctt ctctctggcc taacagcaaa cacgatattt tcactttaga agacagtgta 1920

ccctgagagg aataaaaata attttcttcc aacctgatgg tgccaaactc ataacagaac 1980

aacccgaccc ggcattgaga agttgtctag catgcacaac agttcagggc atagcagaag 2040

cctagatcca gtgtaccagt gggaacagga cagacccagc cttgccccct gagggaggtc 2100

tagaatacct taagtcttga atccaaagcg gccacatctc acaggtctta ggagtgtttt 2160

tggtcattca gcgttttgct ttctcttgac gcagctgtgc tcagtaattg ggtgtcccgt 2220

ttttagtgca gtctcaccat tagcattctg catgttccaa acttcattgg gttttgatgg 2280

gcccattctt catacaccca cccgttcccc aacaaacaca caggcagaat ctatgaatct 2340

acattaaaat ttgtgttaaa ggcgggctga gagcaagtgt ctacctgtac tgaatgtatt 2400

gttactaata agaattaaag gtctggagat ggctcagcta gtcaaagtgc ttgccgggta 2460

agcctgacta catgaattcc acttgggtat agtgacaagc ctggtaactt ctcgggcgcc 2520

atctctcaat agctgcaatc ttgacaagtc accatgttcc tcgatatccc aggttttagt 2580

ttcttgaggt acaaacagag gtgcctccca tgtgctattg ggagaactgg cctgtgcttc 2640

agggtctgtg tgggacagat ccagcctgca tggtctcagt gacacacatt tcatttgtac 2700

atgcatgtgt gtgtctgtgg atgtcttggt gtgtgagtac gtgtgtgtgt ctatgtgtac 2760

atgtgtgtgt atgtgtgtgt ctttatgtgt gtgtgcatat gtgtgtatgt atatgtgtgt 2820

acatgtatgt gcatgtgtgt atgagtgtgt atgtgtgtac atgtatgtgc atgcgtgtgt 2880

gtgtgtgtgt gtgcatgtgt gtgtgtcttt atgtatgcat gcatgtgtgt gtctgtgtgt 2940 

atgtgtatgt atgtgcatat atgtgcatgt gtgtgtgtga gttgtcattc cagttgatgt 3000 

ctaggagttc cggaatgaaa tgatagaacg tagcagttgg atttgtatat caatcttgtg 3060 

tggaactcgt ggccaaagag gcagccctga ggaattgtgt atgtatgact ctgcatgtgt 3120 

gtcatcctca gcacttgatg gcagcaccga ctctcaggac ctgtctcaga catgtagatg 3180 

gaggcccagc cattggtgct ttaaccagat gtgcagagga cctgaatgtg cagtattagc 3240 

atctgagagc tggcaccaac ctcagacctc attggatctc atatttcctt gtggaccccc 3300 

tgcccccctt ctgactcagt gattctggac ttcagctcag gctgagactt taagtaggac 3360 



tctgctctgt ggagctgggg aggtaggata gctcagcagg tagagtgctt gctgtgcaaa 3420

ccctgaggac ctattcaggc atgttgcttg tgtttataat accagagcct tggcatagga 3480

gacaggtgaa tccttgggtc tgctggccag cccatctcac ctaatgtaca gattccagtc 3540

caatggaaga gcctgcctcg 3560

<210> 60

<211> 2334

<212> DNA

<213> Mus musculus

<400> 60

taaaaaccca aattggaact atctcctccc tttacccctt tcccttaatt cctattctga 60

tgacactttg gacatgaatt caaggggcag atgaattttt ggcattgtgt tttttcttac 120

tcaaaatctg tttattgggt tactgccccc ataaaaagta aacatgactt ttaagtattt 180

ttttataaac agctcaatat aaaacataga cagtgtttga ttattttcct tgtgtaagtt 240

tgatttaaaa cgttggaaat gtgtgctttt tagtgtttac taaagtgata agaaaataaa 300

gcattcaata cactatagat tccaaaacat aacattgcac caaatagaaa tgtatatttt 360

attatgcaat gccttagtca taaactgggc tcaaacaatc ctcagcctaa aacactgttg 420

tcttttaata tgcttccaac ccaaaggcct ttcatctcag tatctgtcaa acttgaataa 480

cgtcttctct ttactattac acacgaggca gcctattaac ctgtgtctta gaattgttgt 540

atatagttct tttaatatcc atagggtata tttttgaatt ttttggtgag tatttgttga 600

atttgagata gcatacagta gatgtttaag aaatagtagg aagtccggtg agacggctgt 660

ttccatcaag aactcctagc actgctgtaa atatcatggt gcctactgga agggaatgta 720

gatgctatgc attttggaaa taatctgcat ctgttaaacc tgcagaagtt ttttaatgcc 780 

actttaacac taatgcactg acagattcta aatattttgt gagaaatgtt gaaatgttta 840 

acctgatagg cttctctata aaagagtgtt ttgttttttt ccctgcacca caagctgtgt 900 

ttatcacttt acagttgcat gttcaacttg tcatagctgg aattactgta taaaaagaac 960 

tgattgtgac ttgtagttct tctctagagg atgctgctag aacgtggttt tgctttgcat 1020 

tttgtagttc ttcccgtcag tgctgtgtgt agtcctgctt cttccagtct gcagtattca 1080 

ctagaggcgc cctttgcatg ttgcactctg ttctcatttg gaggttggac tcagaccagt 1140 

tagcacagta ttctctcatc tgtgtcactt tgtaaaacta actgtactct gtatttctta 1200 

tttgtacata tcaatgtgag aaatctccct tttttatgtt gcaattacct tgtgatcagg 1260 

cagcttgagt gctatgcaaa tagtaagtag tgtagtggtg atttttcttt gcatgttgtg 1320 

tgtgatatac ctagccagaa atagatgtgg cttttgtttt gggggcagat tactttcaaa 1380 

agcaaataca attcacttga atttgacaaa ctgaagcaga caagtgttct gggtcctctg 1440 

ataatttggg gtgtttggct gtcagctagg ctcatgaagt ccactgtact gtaatgatgg 1500 

tatttacctc tgtgctattt taattadcct cgcgtctgtg gaactgctga tttgagtagt 1560 

gattagcatt tagaaatttt gtaatgtaga gttttagaga gagcactttg aaagataaac 1620 

attttattat gatggtgcta ggtacaaaat ttatagcatg ggatgtgaag aaaaaaaatg 1680 

agaacccatt gaaaggaaga aaggaatttg ttgtctgctt ctaagctaga gtggttgtaa 1740 

aggttctgct ctgccagtgt tcagtatcag tggctgaatt atggataaga actgtagaga 1800

atcttctgtt tagtccgtgc tttttatgta gaattggttt tatctaatag ttttactatg 1860 

gaaatcgcct tttgatatta aagccagatt ttagaggttt gatatgtttg gtctcaggag 1920 

ctcaaaagaa gtagcttttt ccagtgtctt ttgtgttact gatttgggaa atgttgaaag 1980 

attggacagg gaagaatagc gcttggtgtc ctcatggtca ttctgtctta ccttagtggc 2040 

ttgacagtac ttactatagc tcctgaggga agccaatcaa cttctgtttt cctacctgac 2100 

ctgcagggca tgatggatca gtgatgaaag aattgtagcc tgtggcactt tgttttgacc 2160

tctggtatag aactctgact tttatagttt taaaatgatc aatctttgta tgaaagtcag 2220 

ttttctttct ttaggtatca gaagattttg ccttattctg aggtcggact agggccaagc 2280

aagctttttc cttcatgtga gctgtcatac tgtattttga cctctttctg cacg 2334

<210> 61

<211> 4052

<212> DNA

<213> Mus musculus

<400> 61 



atggaaatga tttttctttt tttccccatt tgggatgatg ttggctgtga gcttgctgta 60

tattgccttt ttttttaggt taagctatgt cccctgtatt cctagattct ttgggacttt 120

tctcatgaaa gtatgttgga ttttatcaaa ggtttttctc tgtctaaaga gataatcatt 180

agattttcat tactcagtct atctgtgtag tagaittatac ttatttctat gtatgaggta 240

gattacattt atttatgtat gtatattgaa caatctctgt atccctgggg tgaaactgac 300

ttcattataa tgggtaatct ttatgatgtg ttcttgaatt cagtttgcaa attgaaatgt 360

aaacttaagg gttttttgga aagtcagttg tgttggatgg tattttcttg gagcaaacac 420

ttgaaggagt attttactga agcagacaca gatgaaagga tgttttgcta aatcaagcat 480

gtgggacatg tgaaggattc atcactaatg agatgcctat attgatctga cttacactgc 540

acagctgagc tctgtttgtt gtgacttcat agaattgcat caaaaaeiact aaacaaaaca 600

aaacaaaaaa aaaaacttta ggctgattgg caaagtgatg tcagctgata cagattcaag 660

tgaagttttg ctaagtcaga ttcacatgct caggcaagat gtggtgaggc aagcaagacc 720

catgaaggac atgtaatact tggagggaat gtaagtagga ctcaacaggc tgtgagagag 780

gcttgggtag gcttggcttg cttgctagta gagctagctg tacaaatcat cttcacatct 840

tcactgagag aggcacagcc aagaacttct ggcattcccc ttggccttga ttcttcctgc 900

ggaatcgtgg cgattaggct gaggcctggc tgtctctgct aggtcatgct accactgatg 960

attcaagttt gctattctga ctctattgaa ctagctggtt tgttggtata ttcatgaagg 1020

atttgcaagt ggattgagct gccattgctg agctgaactg aactgctgat ctcctgacaa 1080

tgcagatggg atttgctcca aageiaccatt tctaaacagg tgcaccccct ccccacctcc 1140

atccaaccct ctgtatcctt tcttttcctc gacctctggt gggtggaggt ggaggactac 1200

aaagaaagtt aaagggttta agaaccaaca ttaaaagtag gccctgaaaa aaattaacat 1260

tattctaagg ggttttattg agagtttttc cattcattca tatttatcaa agagattgaa 1320

ctatgattct cttttgtgta taaggaggtc tttctctgat tttggtataa gggtgatact 1380

ggcattgtaa aaatgatgaa atgtccttta atttcctatt ttgtggaata atttgaggcg 1440

tatttgtatg agttcttttg aaggtgttag aaatctgtgg cgaatccatc catctggtac 1500

tatgctgttt tgaattggag gattttaatt actggatctt tctcgttgga tgttatagtt 1560

ctatttaaat tatttatttc atcttgatat agatttagct tttataggtg atatgtatct 1620

agaaattaat ttatgtcttt cagattttct attatgttag aatatagata tttttaaaaa 1680

tattgtcatg attctctgaa tttccttttt atataacttt ctgtatatta taatgactat 1740

taatttgaat ccttcttttc attgatttgt ctaaaggttt gtcaatcttt tcaaagaacc 1800

aacaccttat taaattgatt ctttgttttg ttttgttttc tatttcattc attttcagag 1860

ctgatttgat tgtctctgcc catctacttt gttttttttt ttcccaagga ctttgtgtgt 1920

attattaatt tgagatgcct tgcttgcttg cttgcttgct tggttacttg cttcctttct 1980

ttcttctttc tccccccacc ctcttcctct ttctctttct ttttcttcct tccttccttc 2040

cttccttcct tccttccttc cttcctttta gctgtagtgt tatggaattt tttcttagga 2100

cttactgtct tcattgtgtc ccatagattt tgggatgtta tgttttcatt cagtagtttt 2160

aggaattaat aaaattctta atttctttct ttcttcattc agttaacact cagtgagatt 2220

ttatttattg tctacaagct tgtgaactgt ctgtggtttc tgtgattgat acatagcttt 2280

ataccattgt ggcaagatag gaaatggggt gtaatttgaa gtttgctgca tattttagac 2340

ctttttttgt atcataataa gtagttgatt ttggagaatg tttcactggc tcctgagaaa 2400

ttattcttgt gataattagc ttaaaatcgc atgtaaaaca aagcatacct gaatctcttt 2460

gatacagatt atgtgtgatt acatcaaaac actttataaa gtatatatga cactctccat 2520

aaagataaat tttgtagatt ggtagaaatg tccgtgagct acacatattg aaacaaaggt 2580

caccaggcta gagagcacct ccagtccagc cttctcagta gaaagggggc tttgcattct 2640

ttctctgaaa gaacatccct gtgtgcttga tattctgggt tcccagagta catatgtgta 2700

agcacaaaga tacctctgta tttaatgcgt agttttgtta ggtctagtaa tcttgtttga 2760

aacatgggat tgttcaggac ttggaataca tgcaggctct tctggcttta atatgttcta 2820

ttctaatgta tgtgccttta tctctctcat agagctttca gtgctctatg agagctcttt 2880

agtctgtgtt cttaatattt gaaccataaa atatgaatta ctttcttctc tggacttgtc 2940

tatttggtat tctgtacgcc tgttttgtac ctagatgagc ataattttag gtttgggaaa 3000

tagtcttgta tggatttact gaagatattc tatgcatttg ctgtagcatt ctgtatttct 3060 

ttttatgcct ttaattcagg aagttgaggt ttggtttttg ttggtttctt tctttctttt 3120 

tttttttttc cttttatccc cagtttatag aattttggcc ttctgatagt tttggttgac 3180 

tctctgtatt cttttggtat gtgtttcatc acttttggcc ttggtctata ctggtttttg 3240 

ccatgtgctc aattttattt tggcttttta cattgtagag ctttgggata cattattagg 3300 

ttatttatta gtgtgaacac tttcagccat aacctttcct ctaaggattg cttaggtatg 3360 

tcccagaggt tctgaaaagt gttgttttca ttttcattat gtttccattt gttaagacct 3420 

ccaaggtttc ttcagtttaa aagtgtgctg ttcagtctcc aagtatgttt tctgaggtta 3480 

tcattgtttt tgatttctag tttgattgta ctggggtctg acagatgcag gaatcgtttg 3540 

cttttccttt cctcttcctt gcatccttcc ttttgctttg tgttctacaa tgtgttcaaa 3600

agagtgtcat gtgccactga tgaaactgca ttatttatct gttaggtgca atgtttgtag 3660 

ctctaccaag taagtttagt tcacctgtgg tatgacttaa ccctaaggtt tctttgttga 3720 

attttttttg actgttgttt ttggtgggta tttggggttt tttgttttct tgagactggt 3780 

tttctctgtt taactttggc tgtcctggaa ctcactgtag agtagggtag atctcaaact 3840 

cccagagatc tagctgcctc tacctcctga gtgctgggaa ctaaaggtgt gtaccaccac 3900 

ggatggtctt tgctgaattt gatttggagg acacatctac ctagtgtgaa agtgaggtcc 3960 

tatctccagg cctctgcagt gatgtgtacc gtttaatgtc aggactcaga aaatatgtgg 4020 

acccaataaa gcctcagtgt gtcttttgag cg 4052

<210> 62 

<211> 1815

<212> DNA 

<213> Mus musculus 

<400> 62 

agtttttact tattccctct tattgagact gttagtcctg acctagttag ctgcattatt 60 

tattccctgt tagatcacag catggagtcg ttgaaatgat actttcagtc agcgtcactt 120 

ctttccacag tgcagaaagg tctggaaaac ctcaggcacc ttgagattca gcctgtgtta 180 

gggctgcagc ttgagaaaaa caccgtgtga ctctgaaaca tttccccgac ggctgaattt 240

ccttttctgg gttttcctcc gtgccacaag ctcaggacca agtgtaagaa gcagcccttg 300 
aaaaatggca gcagggccta gcaatctcca gtgtgagcct tcctgcctgc ctcagcgtga 360

catcggggga cactgaagtg gatgctaatc gtgtgggctg attccttgca cctgctggct 420 

gaccccaggg gaaatctgtg actagcaacg tctggccagt gtctgcctac tttattcagt 480 

acgcttattt tccatcccca tagaaggaga aaaattagta gagacatttt attcactttg 540 

catttgtgta catatgtata aatttactct gagttcacgt gtgtgcatgt atgtgtgagt 600 

gcatgcatga tcgtgtgtgt gcatgtatgt gtgtgtgcat gtatgtgtgt gtgcatgtat 660 

gttcacgtgt gtgcatgtat gtgtgtgtgc atgtatgtgt gtgtatgtat gttcacgtgt 720

gtgcatgtat gttcccgtgt gtgcatgtat gtgtgtgttc atgtatgtgt gtgtgcatgc 780 

agaagccaga ggtcactgtt gagtgtctgc ctgtgttgca ctccaccttg ccatttgaga 840 

ccgggtctgt catcaaatct ggaacttaac aattgagata gaatggctgg ccagagagct 900 

ccagagatct tctgtctcca taaagcactc tgtccccccc ccccccccca gttcagtgct 960 

gtggttatag aactgcgaca caagcttttt atatgtgcac tggggatgca acccagttcc 1020 

ccgtgattgt atggtaggca tgttactcac tgagccgtct ccctagcact taagtttacc 1080 

tctgaattgt tagactctag cgtgtctgta gtgaagagcc cattaactag aggagtagtg 1140 

ttttcttagc aacctcaagt atggtaaaat gttatactgg ctagcttgca tcagagggaa 1200 

aacacttttt tttttttaac tttttgggat ttctatgatc attatttcaa atcaggaaga 1260 

aaggaatttt ccttgtggaa agctaacatg gtttgaagta tttggctctg cattgcactt 1320 

gtctagagcc tgtttgcaga tcagagttta cgcagtgccc acacacatga tctccttcca 1380

ttgtgatgat catcctgtaa aatgggtagg acaggaggca ttatctcctt ctaaggtgag 1440 

aaggcgggtc ctggaagatt aaatgactgg tttgcagtca gcaggcaggc aggtcctgga 1500 

agattaagtg actgatttgc tgccagcagg gcagaaatga cagaaacgat acagaaggga 1560 

tccaaatcta cttcttgtgg tttagttccc ccttctttgc atgatacagc aacaactcag 1620 

tacagtccag agatgcagtt tatttattta tttatttatt tatttagact cactggtgtg 1680 

tctgtgaaca gtcatcatga gtcgtcgtct ggcatagtcc ctatgagcac caggatttgt 1740 

gaatatcttt attttgacag gaacggeiatc accactgctg tgtcccttaa taataaaatc 1800 

tgtatttggt tcacc 1815

<210> 63

<211> 1727 

<212> DNA 

<213> Mus musculus 

<400> 63 



gtcgtacaag tccaaaaagg gaaatagaaa atggagcatg ctgtagaata gaccttgaat 60

ggactggaaa gtctcttgtg aggagggtgg ccttatttaa ggacacaggg attttattta 120

agggttttta aagtggagta ggagaagaaa gcagagaatg taatcaaggt tgtgtaatgt 180

ttgactttga aaccaaataa gattcataag aattttcaaa aatagtacaa agggtgttat 240

gtgtcttcac cctgtttctc ccagtcagag aaccttacac aatcataata tagtgttgaa 300

aatacactta tgttgctgag tgtcattggc taagatgtag atatatttcg attttataat 360

ttattttagc ttaaaagttc aaaacaatgc atagaaagga ctgaagccat ggctggaact 420

ataaattcac aatcttagtt ggagattaag gagcaaagcc cagttggtag atgagcgatt 480

gtttgccaag atagttagaa gggcteiaaat gatgtctcag ttggtaaagt gtttgccatg 540

caagcacaaa gacccaagct tggattccca gcactcacaa agaaagatga acatgatggt 600

gtgtgcctgt aattctaggt ctagggaagt ggagacagga ggagctcact ggttatccac 660

tcttgctgaa ttgtttagtt ccaggttcaa caagagactc tatccaaaat tataaagtgg 720

agagtgacag agaaggacag ccgatcaggc atgcacatgt ttgtatgttt gtgttatcta 780

tctatgtatc tatgtatcta tgtatctatc tatctatgta tctatctatc atctatctat 840

ctatctatct atctatctct gtctttcaca gacacacatg tgtatacaca catacatcat 900

acacaaataa aatctagatt actagaaata tttagaggta gaaatgctct atatgtctaa 960

gtgactacat acttgtgagg cagaagagat aaagtccaag gggagctatc agtcagtgtt 1020

tatgcttcta actgagaact agaagagaag cagatttatg gaaaaccaag acactttcac 1080

cgtgatcaca gtgacccgca gacacctgtg taggggatgt tcttactacc tgaagctcaa 1140

gaaagatggc ggctggagtt gggagggtta aaggtttagg catcctacat gtatgggata 1200

ataaacaacc tggttaacac tgtcactttc ctgtgattct acttaaagtt atgattcaaa 1260

gatttcctgt gggttttttt ggtgtgtttt gttatgcttt tgtttttaaa gcttagcatt 1320

tatattagtg taaactgaat tcatgataaa gtgtgatgtt ttctttaaca ttccctccaa 1380

aatgttgctt tgctttggga atactteiaat aatgtactgt ttaggttttt ccctcaagtt 1440

agcaaagata caataatctg ggaaggagga aactcaaccg agaaactgcc tccatcagat 1500

tggcctataa acaagcctgt gggtgcattt tcttgctgga tgattgatgt ggaaagaccc 1560

agccctctat ggctgtggct cccctggaaa ggtggccttg gtttatataa gaaagcaatc 1620

tcagtgaagc aatctgaggg gcaagtgtaa gcaccagtgc tcccccttcc gtgtgtcagg 1680

acggatgtca gttgtccaga agatttctga gttcacttct cagaatg 1727


<210> 64 

<211> 2314 

<212> DNA 

<213> Mus musculus 










<400> 64

ggttgattcc caagtctaca ctcattcaca tctttgtgtc tttgtttatt tgtttttatt 60

tttttaactg aaattgtttt tacacaatat attcttgtca tggttcctct ctgcctcctc 120

cctctcagcc catatcctcc cgacttcccc acccaaccac ctgactccat gccccttctt 180

cctctctcgc tttaaaaaat aacaaacaag ttactccctc ctctcctatg gtctcaagac 240

catctctctc agtgttacac agcattcccc aacacagcat tcacctgtga aggctgggtc 300

ctggttttgc tacatggact ccctaactct tcaatttcta tctatctatc tatctatcta 360

tctatctatc tatctatcta tctatctatt tatttttatg tatgaattta tttattcact 420

tttaaaattt atttattttt attaggtatt ttcttctttt acatttcaaa tgctatccca 480

aaagtccccc agacccgttg aggaccacat ctttttgttc gattgattct atttatgatt 540

ctacaccaac tctcctcgag cacgagacct tgcatgtcct tatggtgtct ttcttaggtt 600

aaatttgtta gcctggcatg agaagttaag cgtatgcttc taccttaagg cattcatgag 660

gaatagctat ggagagtgaa tgatatttca gtgtttttgt ttctcccacc tggagttttt 720

gtcagttgtt ttatgaatgt tttaaagtgg gttcgacttc tatttccact tctatttgca 780

aaatgtgtgt ttctgtctca ctcgactctg ctcacctgta caagtgaaat tgtttctaaa 840

cagaagcatt gtgactgtgg gaaggtctac ccatgatcat gggaaattca aagttcagat 900

tcagacaaac ttaagcaaat ctgtatactt agcactgcag ataaatggga gtgtatttca 960

agccaaacaa agatctggct ttatgaagac ac agate tea ttaatctatg tcaaattgtt 1020

gtttaaatac aagccttttg gaataattca agctcgctct tgcaggaatt aatctatctt 1080

cactttgcca tttgtaattg ctgacctgac aggcatttac tggatttgtt ttgtagtgta 1140

aagtatttaa ggtggaaaat taacatcatg agacaaaatg tcaccttttg ggtgaaaaca 1200

ctcagcgttt ttacctcctg aagtggcagc ccaattcttc tgatggatgg gtacacctgc 1260 

tgctttcaga aaaggcttcc agtgggtcat gaagtctctt tttgaaaatt ctaaaatata 1320 

cagtatgatt acacctgagg ctgcaaggtg tgtgtgtgtg tgtgtgtgtg tgtgaatgca 1380 

tttatatgct tatgtgcctc tgcatacata tgcatagaag cttgtggaac tgtctggact 1440 

ctaaaaaaaa atacacatca tcattttgtt ctttgtcaaa cacaattctc aaaaagagtc 1500 

tagtgtaatt aatttaaaat tcaatgagca tttgttgggc acaatctgaa ctgaacagtg 1560 

gaaaactgaa cagattgaga gacttaagga cttcagggca cagtggacag atcaaccagg 1620 

taaggaagta actgaagatc ggggttactg ggatccacgc aatactgtca ccagtagtgg 1680 

taaatacaga ggacagagtt agcactgatc agtctaggag aaactgtaga gcaaatacca 1740 

tagagaaggc tcatttttat ctaaccttga agaatacacc cacctggaag tattatggaa 1800 

ggaagaaagt gttaaaagac acagaaacca tgaagtagca tgaaatattc ataaattgct 1860 

gaaggcagga gttgggagac agcccttgtt ctggttatgg ggctcttgga agggacaacc 1920 

gagaatagcc actattggcc acctttgtat taaattcaag gctgaaaaac tcaaggggtt 1980 

gcagaaagtt tctgcagaag ctgcctatga gtctgttgct aggactgact tgttagtggt 2040 

cagtcttggc tccgtttcca ttctttcaag tgtggcgact cccaggaaaa taatctacca 2100 

tgtgttaact gtgctgcacg atttcctctt cacgttatac aagtgctaga catttgttga 2160 

ctaacagaga gatgttagag agatggaaga tggtgcttta aggcatttcc agtttgtgag 2220 

agattaggca taaacgtgca aagtactcac aatgtagggc ttcagtgaca gtcgcattaa 2280 

aagtaatttg agaagattgg ttggaacaag ctcc 2314 

<210> 65 

<211> 3368 

<212> DNA 

<213> Mus musculus 

<400> 65 

tgagtggtga gtgactcggg gcggctccaa acaagctgga gggcttggcc ccgccttcct 60 

cttgctctgt ttttgtgggc ggtctagccc aggcctcatt ccacgctcag tccagctcag 120 

tcatccctga gtcttcagtg tttccagatt tagcctgtta ctccaccacc ctcttacacc 180

tccactcatc tggttactgc ctgtctttgt tactgagctc ttcactaccc ccatccatct 240

aaaatactcc cgaaggtcag cacacttctg caggggaaat ggtaagagac agctcagcca 300

ggaagaatgg agaaggctgc ctatgccccg gtagccatgg tggatggcag agaggtgtgc 360

cttggtccac agtgtgtggc ctagggcttt tgtgtattgg gtccatcttg gtgggaaggc 420

tgagacaaaa gaacttcagc ctggagttag ccaaagaaaa gctgggcatg gtagttcgtg 480

cctgtagtgc caggattcag gaaggttggg gattttgagg cctgccagag aggtacatgg 540

taaggtttgg gaaaagggga aggggagaag gggtgtgtat gggcagggat ctaagacagg 600

gaagcagcgg gacttggtat gggcatcctt gatggacggg tgctttaggg taccctttac 660

tcgctcctag gaacaggtaa tggggatgtt tagaggaggt ggggtgggtg aaagggtccc 720

acagtgtgct tcatggattc tcttggtctc ttgcagaaac gggggatatt tatatctggg 780

gctggaatga gtcagggcag ctggccctgc ccacaaggag tgggacagag aacaaagcag 840

agagagagga agccacagaa ttgaatgaag atggtctcaa agaggaatta gctgtggctg 900

atgcaggagc tcctgcccac ttcatagcca tccagccctt ccctgctctt ctggatcttc 960

ccctgggctc agatgcagtt atggccagct gcggatcccg acacacagct gtggtgacac 1020

gcacaggaga actctatacc tggggctggg gtaaatacgg acagcttggc cacaaggaca 1080

gcaccagctt ggatcgaccc tgctgtgtgg agtactttgt agaaagacaa cttgaagtaa 1140

gggctgtgac atgtggaccc tggaatacct atgtctatgc aatggaaaga gacaaaagct 1200

gaactatccc ttagtggatc cccacttatg tgcctggttg ctgtatggac caatgacagc 1260

cccattaaga cagcaggacc agagactcag ataaaaatct ctgcagccgg agctgtgaca 1320

agggaagcaa tagttatatc aaaacagcca aggctgtcca agtctctcag aggaaaatgt 1380

ctggacccat gagaagagaa agcactttta tgagattctt atatattcat attcaaatct 1440

tctctatgta gcacaggctg gcttccactt tgactcaggt tcctgaatgc tggattatgt 1500

ttagagatta aatctaggac atcacgtatg ctggtaacac ctcctaccac tgaactaaac 1560

ccacagtggt gtagataaca attttgcatc aatccacaag atgtctggaa ctccctgtct 1620

tgaactcctg agatctgtag attacagcct cccgagcaat gcgattacag actatactca 1680

agagatgaga agcccacacc attacatgct cagcgtttac acagagaaga ggccgagtag 1740

ccgcgacttg cggcggcgcc tgagctcggc cgccagctac gccaggcctg agacaaaccc 1800

accccggcct cgaggaagcg tgtcaacccc gcgctgtccc cgctgctcgc cccgctagtg 1860

gcctcgggca cagatgccct ccggagccgg cttcgagcgc cccagccaca gacatcggtg 1920 

cctgaagacc ggcgaaccca gcttgtcgaa cctggtacca gacccgcacc tcagtaatgg 1980 

gtcccgcccc taagagaccc gcaccttctc ctcatggtgg acgctccaag ctgtccagaa 2040 

tcgagaccct ccacttgcca cgcgcgggac ggagaacggt aaccagccgc agcagacttc 2100 

acccggagtc cgtctccact ctcgagccgt tggaccccgc ggaattcaaa ccgagccaga 2160 

ggcggtgcca aatgacaatt ggttaccgcg tccgccactc acggagcccc gccctcgtcc 2220 

cgcccctcct gcgaagggct gctgcctagg cctgggatac agagaccgcc tagggcgtgg 2280 

gaagcgcttg cacggggagc gtgcgggcct caatatgcgc atgcgtgcac ctatgcccgc 2340 

ctcaagggtg gggtgtaggg gtgtggccga tgatgtcacc ggtacccact gagttgctgc 2400 

cgttgggttt caaatttttg tgcttcctga gaagtattgg gtagactgcc agaacagctc 2460 

gacttcttgc tactctccaa tgtcccaaga ggagcaggga ctagcaggaa agcttagaag 2520 

gaaaaaaaaa ccaggctctt ggtcctggta ccttagtgga caagtttatc gcatttaatc 2580 

attagaacac acctgttaat aatgatcatc actatcctga agaaaaagat ttgagtagct 2640 

tttccagaac gtaggcggca gagacaggac agggaaatag tttgggctac aggaaaaaac 2700 

tgacttgccc aatttagctg tgagcaaatg gtgatttggg ctttgttttt cttgcacatt 2760 

ataagattta ggtaaatcag gaaaactcca caatttgaac atgacatcaa agtatttgtg 2820 

tagccgggcg tggtggcgca cgcctttaat accagcactt gggaggcgga ggcaggcgga 2880 

tttctgagtt cgaggccagc ctggtccgca aagtgagttc caggacagcc agggctacac 2940 

agagaaaccc tgtctcgaaa aaccaaaaca aacaaacaaa caaacaaaaa acaaaaacaa 3000 

aaaacaacca aagtatttgt gtgtcctttt cttcccactg acccatagca tctgtatccc 3060 

attttcaatg atttactatt acctaataca ctgctatggg agtatctcct ttgctactca 3120 

gactaaaatt tgtaataaag ctgagatgaa tttctttttt gctacctcct ccaaaatgat 3180 

ttgtagaagt gtaattattg aattctcctt tgcgaagagg tacttattaa cctatcttct 3240 

gcaccaaact ttacctggag tatccttgtt acccacagca ctggtgatta gcaatgcgct 3300 

tttaaaccac tagcttccat ctataggccc aaaataaaac aaaatctagt ttatactgaa 3360 

ttataacc 3368 
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<210> 66 

<211> 1763 

<212> DNA 

<213> Mus musculus 

<400> 66 

gagacttgag cccatcggga ggattgtgtt ctggggcctc ctcaggcagt acctctggga 60 

acaggggctt ggatcggtca atgacgaaca tctcgtgtcc atggtcatca cggagctcct 120 

cgagctgtgc aaaggtgcaa ggcccatccg aataatcgtt gacaaatagg gtgccttctc 180 

ccgacggcgc aggcaccttt ttccgccggc gtcggcgtcg gcagatcata gctgccaaca 240 

acagcgcggt gagcgccagc agcgctatgg ctgctgtgat cgctgtctgc gttgctaggc 300 

ccagggctcg gaaagccatg ctacctgcct catgctgggg ttcatgtccc acagggcgtg 360 

tagctggggc ttgcgggtca ggtagctgct gtgactgctg ccgagatgcg ttgaccagta 420 

gatgaaaggg cactcgagct ttgccacctg cattggcggc ctcgcattcg tacttgcctg 480 

cgtgagccag tgtgatgttg gtgaggaaga gcatgccgct gccagtatct cgggtgccgt 540 

gcccgcccaa gcctggcgcc ccaccttcca gctgggcctg ggcttgaggc ttaccatcgc 600 

gaggctgggg cacctttctc cagaccacca ggggctgtgg gtagcctgat gcctgacagg 660 

cgacctgcag gtcctccccc agattagctg taaactccgg gggctccacg ttcacagagg 720 

ggggaatgca gatgaggcta ccaccagata cttctagtag actctgtagc gccaggcgtg 780 

ggggctctgc acatgtgatc ttcttgtctc tggagctcag aagccgtcgg cccccctctt 840 

tgatccaaga gcccagccag tgaagggcac agtcacaacg ccatgggttc tctgtgtagg 900 

gcagaataga gcctaggtag atggatccta ctaatccaaa cagaagccct tatctatctc 960 

ttacagatgc ctatagggtt ctagaaccct tgtcacctct tccctcactc caacttccta 1020 

ggaatgtttg tgatgacagt tcctgaaaag gcgtcagctt ctcctgtcct ggctttgcaa 1080 

gcctctgcat ccctcagccc ggcactgaaa ccctgtcgtc acatgagctc tgtccagttt 1140 

tgcctggctc cccctgatga caagttagaa gagcctcaaa tctctacaca cagcgttcta 1200 

ggacaaccag tgcctcacag aaaactctgt cttgaaaacc aaaacaaaaa agagcttcaa 1260

ttctgttaag ttactctgca gaacttgtac cttgtttcct taaatagatg cataagctgt 1320 

agctcctaac agtactccct ggtagtccct tatcaaggcc gaggttcttg tgtggccctc 1380 

caaagtaatg cccactggct gccatagtga agcaagacag acaaggggaa aatatagggg 1440

cacaggaagg gaggcaagtc cccatagaat ctggctgaag cctctcacca agtgacatca 1500

acattaactt tgaagattgc caacataaca atactccttt tacatttaga agaatattaa 1560

aacagcagtg aactctagga aaggcctcgg catgtatctg tggtcctcaa ggatgagctt 1620

gctctttttg taggtagtaa cgagactaaa cttttgttta cagtacgggc acgttccctg 1680

gcaacccctg ttacttggga gcatgcacta tcttctgtat ccctacccaa tgtgcaggac 1740

ccacgctata aatgattcca ccg 1763



<210> 67 

<211> 2324 

<212> DNA 

<213> Mus musculus 

<400> 67 



aaaagccttg gaagaatcat ttgacctcgt agaaatgtca cacacttcaa gctattttta 60

agaatatatt tatttttaat tgcacatgca gaaggtgtat gtgcatatgt gattgtgggt 120

gcccacaaag gctataaaag gatggcgggt cacctggacc tggtcttaca gacatctgag 180

cagcctgagc caggaaatga acttggttct tgcaaaggaa gtaaacactc tttaatcctt 240

gggccttctc tcctgctctt atcctgttct tgtgaaaaag tgaatctata caagtgtagc 300

acctctccta gcctcatgtt ttttctgtgg ataaaaaaag cccgactcat catttttttg 360

atgtccttaa aattggattg cgggattggg caggatgaga aatttcctat attgtatctc 420

ctttggtctt cctaaaggat gctaaatcta tctgtgcaaa ccataactgg agttagacac 480

tttgctctgg ccttttcagg agctccgaat tctagaaaga atttaaaaga aaagattaaa 540

ttgtaattga atcaataaga gccagagcga taaactttta tatagagaaa tgagaattta 600

aaactgtagt aagaagtgac ccattaccct ccactattag ggtgatgaat taaccaatga 660

aagaattcgc taaagataaa ttatttagaa ctggattctt ctctgaccaa ttgtttaaaa 720

aaagaaaaag aaaaacacaa tactatcaca gctgaatgct cttctggtgc ttgcttaaac 780

ctgagttata caacagtgta aaaatgttgc cttcaggaac taaaatctat cactctgtat 840

tagatgatac ttcaatgact accccagcaa cattagatca aattagtgct gatctaagga 900

aagtattatg tgcttagttt tgctcacttt acagctgttg ttcttacctg cccatgggtg 960

tcacctgcag acttgccagc ctcgaacttt aaggccaacc caaagtcagc aatgcaagct 1020
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gtcagattgt ttttcaacag cacattttta cttttgatgt cccttcaaac ataaaaataa 1080 

aatatcttaa gcaagttatc tttcaaattt ttctactgaa tacatggaga atatgatata 1140 

aaaataaagt tacttaaaac taagaaactt tatttagagc aaagaagagt gcctcatctg 1200 

gttcaactta aagacaactc tagccaacat gtctatgctc tcatcaatag gtacacttag 1260 

aagtagtcat ttagttctac aaaaagatga aggtagagtc aaatgctcca gacaccaact 1320 

gacaaacatt ttaaagggcc ctataactaa taagttaggg gactcataaa catgaaagaa 1380 

aggagaggac tttctaaaga agtcatggtt tggtaacact caactacaca tctgctaaat 1440 

caatatgtta ataccgactt ttaaaacagc cattttaaaa ctgaatagaa agacaactca 1500 

atcccatatt attccaagga agttaaataa tagcaaacta aatttatttc ccagaaacaa 1560 

atatttttcc tttttaagac tgtaacagtc atttcactga tgagcactat ttctttactc 1620 

aagagctagc tgtctcctct tggttggtcc cttctacttt agctcaagga atgaatgtta 1680 

tatttattca tccaaaagac agagaaaata ctttatcctt aattctatca ctgtggattt 1740 

actaaaaatt gtctttaaac tcaaaatttt gagttaaaat catttctcct ttgaaataga 1800 

gaaaatacac ttttattctc aaacatactt tcagaagtta cacaaattaa aacatgggat 1860 

cattagactt tattaacaag attaacatgt acattaacat gtacatacca aaacacaccc 1920 

tagccaatct cacctaattt taaaaacttg aaaagtcttt gcctcttcaa atatgattgg 1980 

ataatfactg caaaaataaa ttttactcca agagttaaaa gtttaagtag taaacatttg 2040

agctggatga aaagacaata cagattttgt tctacctgtg agagattgca ggcttgtggc 2100 

catcttttaa gccaggtata tcctcatgta aatatgccaa tcctctagcc atggtttctg 2160 

caatatgaca aagttcattc caagagacca cattagcctt aagaaagtct gacagtgagc 2220 

cctaaagaaa ggagaggtgg gagggaaata cccataagct cttgatcaca atttcacata 2280 

aacagaaact tttctctgga tataaaagca acacattcct ttcc 2324 

<210> 68 

<211> 1378 

<212> DNA 

<213> Mus musculus 

<400> 68 

tttgttgagt tggtttttgt gtccccatcc tgcttcccag cctcttgcat cagtctccac 60 

cctttagcca tatgctgtac cgtagtcatt gttttcttcc taccgtgtac aacatctcac 120

cacgaaacag cagcatgggg ccagcacttg gggcccaaaa gcctgaagtc tgaagaggaa 180

ggaagcggca gtgcaagcgt cactgcagag gagccgggct atgcccagtc acagctcctg 240

gccttcacct cgaggatgaa gttgaaggcc acagagacag atgcacactg tgcacagaag 300

aaaacacagc agatgccact ttggagaggg caagagaaag gaataaactc tatttgataa 360

tttatattag gaggaaagag gactgaagat gttctgtgta ggaacagaag aacggacagc 420

atttctgtta gtcatttcct ggaaaagtaa tattttaatg ggaaattatg gaaacaatct 480

aaatgtccaa ttgctgtgct agggtaggga ttattttctg ggaggtgtgt gtgtgtgtgc 540

gcgcgtgtgt gtcccacaca tggctttcta ctctcccaga gggcaagggc taagtgtggg 600

aaatagtgtg gagcttagct gaaggacagc tgtagagcaa agcacatcca ggagccccag 660

gtgtcactgg ggtctgggca gccccgaaat gagatggggt aaggtattgc tcattgctct 720

tcagaaagag tgcttgaagc cccaggctta ctctattgct cttttagttt gacatggtat 780

ttggattttt tttctttttt cttttttttt ttgttttttt gtttttttgt tttgtttttt 840

gggttttggt ttttgttttg ttttgttttt gaaaggtctg aaagtgaaac ccttcactaa 900

atggcaaaag aaactgtctg tgctgctcca gtccctccct gtgtccatct ttgtcctctc 960

cctgtccctt ccctgctacc ttcacccagt ttgtgtatgt aagctctgca ttcagacagc 1020

tgcagcattc cgaggttgga aatgtcactg attcttgcac cttagaccag ccaacaggtt 1080

taccagttcc ctccctgcag tacccacttc ccagctatag ccccagtctg catgagaatt 1140

tggtgtttgg aatgtttatg actctctcgg cggggttcct cgccttgcca tcctcactgt 1200

ggggtaatga agaaggggag gagaatcttc atcaactggt tttgtgtaat aaactttcgt 1260

gttttgtttt gatttgattt gatttggggt tgtttttccc ccctgtctgt ctgtctgtgc 1320

aagatctgca gctgctgaaa tcagctttgc ctttaattaa accgtgttct ctccaagc 1378


<210> 69 

<211> 3137 

<212> DNA 

<213> Mus musculus 








<400> 69 
attttaacct tgtaatatga ctaatttgaa tggtgtgaaa ttattttatc atgtaatgtc 60

attttcacct aaacagtaca attaagatat ttaaaactaa agatccaatc attttacaaa 120

ttcctaaaaa taagtgatag attttgtata cactagtaca tcctccacat aaaaagtagt 180

tttgtgtatt tgaatgctca aaatttcttc aagatgggat taaaaatttt ttttcctagt 240

tctctgttat ctcattttta atcccttctc cctatgttct gcctagttac cgttttctat 300

ccaaattctg agtgtttcct aggtcttggg ccttatctat ttcttttctt tatgcagtct 360

gcttcggagc acatctgtga gcttagaatt atgtacaatt aaatgtatac caaaaccatt 420

cacactcccc tttcctctca ttgcatgcct tagtatctta aaggaaaata aaaatgcctc 480

agtttaaaat atatgagtta aaatcccttg agctgcagga tatgtgcaat aatgtctttc 540

accaaaatca agctgtggat tatttcctca cactcagagc ctaaatgttc attggagatt 600

taccattgca cttaagtgcc agaggaaatg acaaggtgac cattggaatt gttgtccaaa 660

cagaaaccat tgatatttaa gagaagattc ataaatctga agttatggtt ttctatgcca 720

gacaaatcag ggaacattac acttaggcta ctggagtcaa ggtcatttac gatttgtcac 780

ccacttcaaa aattaatttt attttttaag attacaatat gattacatca tttcgccctt 840

cgattttctc cctctattgc ctctaatata gctccccctt tagccctgtt tcggattcat 900

atatatgtat atatatatat gtatgcacat atgtatacat atatatatat atacatatat 960

acatatatat gtatgtgtgt atatatgtat gtgtgtatat atgtatgtgt gtatatatgt 1020

atacatatac acacacacac atatcttctg ttacttgtat gtatgttttc agagctgacc 1080

atttggcact agatgaactg tcagagtgct cctccctggg gaggactgtt tcctccgcct 1140

ctcaggcttc cttagttgcc tgtggttctt tgtataggac taaggcattg tgggtttcct 1200

tgttcccttt ggcaagtctc ttgtttctct ctttgtttac ctcctattta agcagtcatg 1260

ttggtgaggc gttatgagtg cagcttggga cattcctgga gacatgaact ctgtcacttg 1320

ttttttaaaa agtcctttcc taaatatgca cccagcagct ctagggatga ctcattggtg 1380

aaatccatat cataatctaa ccatttttgt cactctaatg gcaataatgg tgagtttgtc 1440

tattttaatt atgtacattt tgcatttttt tttacattta tttattttgt atatgagggt 1500

tgagcaagca ttttccacaa cacagatgta aaaattggag gacaattttc agaagtctat 1560

tttttacctt ccactgtgat tattgacctc acgtcaggcc aggctttaat acataccaag 1620

tcctctgggc catcccaata gccctatttt actaattttt ttaaaggctt agcaaatttc 1680

aactacaaga aaatattttc tttatatttc ttatattttc tttacatttc tgttattgtt 1740






gttattacag tttttaatat agccctctgt ggtgtgtgac tcttcctgga gagaaaggga 1800 

gaaatgacta gtcacctcag ttaggcaact gattagtcta aggaagtcac ccaagtccaa 1860 

cataggaaga gccctagtaa atctggtctg tttcaggtac ttcctgaagc gattaagatg 1920

tttacttcct gagtttaaag agtttcctta cagcatgtgt ttcaggatgt cacctgtttt 1980 

gacatcttgt atcttaagga tcttccttca agatggaagt gtgtatttca gagggaatta 2040 

gcacactcat ccagattgga ctctgaatcc ctgaagaggc tgcaggctac tgagcttctg 2100 

cttatttctg ccacgctgac cactacacac gatgccgtga aacaaaccta aggctttagg 2160 

catgttcagc aaacacttac caaatgtgct atatctccag cccccaaata agttctgcca 2220 

ccttggtggt cctatggatt gaactcaagt cagtaggcag tgcagcaggc ctatttatgg 2280 

actttcttgc tggtctccaa atgaattctg ataagaataa aaataagatt aggaaacaat 2340 

agtgtgcttg taatttctca ctgtgaagta tcctgggtct ggagtgtggt gtattttcgg 2400 

tagaaacatt tccctcccac tgtcttccct acctgactct attaatactc aaactgtgtc 2460 

ttagacatag ttatgtaaca ttatgcatta tagaaatcta agaatggtca agatttgggc 2520 

cttttcatca gtcccagtac ttagtatttc ggcaagtcct gctgggactc tcaaatgttc 2580 

ttgaatttaa ggtttatcgc attctatttt gtcacctcgt cagggccagc caccctcaac 2640 

tttttcttca gttcactgaa ccaaaagatt aactgctctc caagcatcca ctgcgacccc 2700 

aaacctgctg tacatcttca ccaaagtgat ctttatttat tgaatcgcag tacttctctt 2760 

ttcaatcctc ctgataactt tctatctggc ctaaaataaa attgaaactc tccacagcaa 2820 

agccactgag cacacagcat cactcaatat actttctgtc aactttctgc ttaatctctt 2880 

cttatttctc agagagaaaa aaaatgaaat tgttattttg tttgattctt ttatcttcgc 2940 

cctttggttg ttttctgtcc cctctaggat attacaaact tgtaactaaa cttagcccgt 3000 
gatatctaga aatgaataaa actcctacac acatcagaag taagcacagg gtcgcctatc 3060 
tgaaaacagg accaaagagt ctttccaact tatttggaaa aacaaagctg agagcagtgg 3120 
tcagtatcat tttctgc 3137 

<210> 70 

<211> 2795

<212> DNA 

<213> Mus musculus 
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<400> 70 
tttcagtgtt gttcaaatca cacacattcc attatttgat tttagagagc acagtagata 60

ggtttcagct tccctagaga attctagaag gttcttggat ggttacttca aacgaagaat 120

actgctgtac ctgagttgaa cagctaggga tttcctatat taacctggct tcggcacata 180

ttaaaatgac ctgacttcta tccaagggaa attcaaggta ttatgtcaag aatctaataa 240

atacatcttt actatcatgt cagagctaat agcacctctc taattgagtc tctctaatgt 300

ttgcctcgga agcttggctt tttgtttgtt tatttgtttg aattgtcttt ttttttttct 360

gttttttttt ttttttgact tgtggtaaat aacgtgccct tggtttgtct tcctagacat 420

tcgcagtcat gagtttcatt tgtattaaag agattttatt atttactgag gtggtaagca 480

cagatctcac atatttttct ctcactgctt gaaaggccct gctcaggcca ctttaaaaat 540

aaatatatta aaatttaccc tagcattgcg aatgagtgat gtcatccaga gtgttactag 600

ggatggcaga cagtcttctc tgaacacaaa gcttgacaag aagcctcaca gtgagagcct 660

cagcctgaag cattcctgtc agaacctcac actgcctttg ttttcgcctt ggccgttctt 720

tgtgcactgt gcatattcag ccactgggtg gcagtatgtg cttaagaata tttaatgttt 780

gccaccttgt tttgacagaa tttaagaagt caggtttttg tttgtttgtt tgtttgtttg 840

tttttgtttg tttgttttgt tttttgcttt tcaatcttgc aaaatgacca ttgtattgaa 900

tataagaaat acaagtattt tctgaatgca acagcctact cataaagtgc ctgaaaatac 960

ttcttttaaa aatatattta atttgagttt tacagtttcc gaggattgaa ttcatgacca 1020

ccatggtggg agaccatgag gcagagagaa ctaacaggga atggggtggg cattggcaca 1080

cttccttcaa caagaacaca tatcctaatc cctcccaaac agttccacct cctggggacc 1140

aaacgtctga atgtacatgc ctatggaagt cactcttgtt caaatctcca cagttaaaat 1200

gtatggtgat gctctgcttg gcttttctca aagatgttgg aacccagcta actttgtagg 1260

cctcctctct aagaagcttc tattgcagaa ttgggagtct gtaaattcta tcagaacacc 1320

acacaccttc tccaactcat gggagtggtt ttctgacacc aggtcctagt actgacagga 1380

gacacacaca cacacacaca caaacaccca tccctaactc agacgtgatg ttcactgatg 1440

accaatcgca aatggaaaat cagttcttgc taatgaatca cagtggaatc acccataagg 1500

ccagccacca tgcccagctg tagctggcca acacaaaaca aaggaattgg cattgttggg 1560

ggttctttga ctcagaaggt tttaacaagg ctggggcatt gtttgtttta ttaccttgta 1620

ggtcctttgc ataacatcag agcttcatgt tttctgtgtg cataaacatg tgtgtctctg 1680

catctacatg tgtatctcac gctctctctt cggctctttt cttctgtttg tgtctttgtt 1740

ttgtcctgta tctgcttttg tatatacata tatacaattt tgtgtgtata tatatatata 1800

tgtatatata tatatatgcg tgtgtgtgta tacatatata tgtgtgcgtg cgcgtgtgca 1860

cacacacaca cacacacaca cacagaatgc attcggtttg cctatctggc ctctggttct 1920

tggccactca agcagtgtta gggatgggct ccattgcata ggctttaagt taaatcagac 1980

gagctttctg ccacaattgc accggtatat cttgcaggag aatcaacatt gtagatcaca 2040

gacttggtag ctggatttgg atttatcttt ctcctttcat ggcatgcagg gtagcctcca 2100

gcaccatgaa cactagtctc taaagctgag agctgtgcgg caggctccag tgcaacttct 2160

ttgtgcttga taagccttgt aggtgttgtc ttcagccctg gtgctcttac ctctctgtct 2220

gttgcctgat gatggatttt tctccttcca cgatgttccc caaacgtcaa aatacagcca 2280

cagtgatatc aggagctaag ccattagtct agcaccacaa ctttctgata ttttctgtct 2340

taccctaggc atgatacatg gatacgtgca gatgttgaat gtccatgcac ttaaattaat 2400

tctatcccca aagcagaatg agaccaatta atttgttctg aatataattt ctcttcagtt 2460

aagacaatgc ttcccgacct tcctcatgct tcaacccttt aatacagttc tccatgttat 2520

ggtgacctcc ccatcataaa attatttcat tgctacttca tacctgtaac tttgctactg 2580

ctataaatca taatataaat atctgatatg caggatatct gatatacaat ttccaaaggg 2640

cttacaaccc atagtctgag aacagcttaa ctcaaaacac tataaacatg gaattatcaa 2700

agtgcatatt tttctctttt ggttgtttgt ggatttttcc tatgttttac ttaataaaaa 2760

taatatttac tcttaaatca tgttttatct ttctc 2795



<210> 71 
<211> 2554 

<212> DNA 

<213> Mus musculus 

<400> 71 

aaaaaggttg tctctaatca ttgtatacag aattctggac aatgaacctt gcataattaa 60 

cagattggaa cacacagaaa gtaaggcttc aaagaatagt attaaaacat taggttttat 120 

gaagtggagt cattctattg ttcaagaccc tgtcattgaa taatgttgga catggctcag 180

aattttatga tgtagaagaa ttcaaatgcc aaatgtgcat ttgatctttt aagagggtga 240

atagtaagtt gttattgtga ttgcgctggc cctttaagaa atgagtgttt ggtaagttta 300

tgtgtaacac ttgccatgat tttggttatg aataggtcgc aaagatgttc ctcgtgctgg 360

cagatggaag ggcagcaaga taggtggctg ggaggaaagc cggaggcgga gttgtgctag 420

aaagaacata gcaaattcaa cagatggagg atgatagatc agttggcaaa aagaggtgct 480

acaggagaac aggggaagct agtgaagaca tggatggata gacagccagg agtgggctaa 540

atgaaccagt aataaaagaa atagagcagt acaggggccc caaagttcaa gaacaaatag 600

ataaatatca aagacgacaa ctacaaaatt cctctgtatc tgagcttggg aaaatatctt 660

cctttaaaat gtcttttatt ataatctttc aaaaatgctt taaatgaaca aagatctcat 720

atacggaaaa tctttattgt aaaacttaga gattcatact atggaggaaa tttattctgt 780

aaaaaaatga tcatggatgt aaagttaatc ttattcctta gttttgtgtt ttctgcaaat 840

ttggattttt gacagctttt attatgtaaa tatgtcagga cctggctgga aaataatgct 900

gaggtatatg gtccacaggg atgagaagtt tgatttcttt caatagagta gtataaaaga 960

atggaaaatg gaaatacagc attatcctta caaaatgtcc tgttttctct gcttaccagg 1020

agtgacaata gctatgtgct cattaacctt aagtaagtaa atatgttgtt tgtcctgttt 1080

ttgccttcag cctcaacttc ccatataacc accatttttc ttaggaaata gtcttacatg 1140

aaaaccctta gacagttaag taacataacc aagttccctt tgattaataa ctgttactat 1200

agattaatat ataaatgaca ggctcactaa tatatgcaga aagagggttg gcttaatata 1260

cccattataa ctgaccttta aatacctgca tgggttgcaa agctactccc attgccccca 1320

tatacatgaa tgtagcacga gcacttttga tatataaaaa ctggagccaa tctctgaagc 1380

aaaggaacac tttatggaaa actccacagg cactatggca gacgtgatgg taggcactat 1440

ggcagacatg atggtaggca ctatggcaga catgatggta ggcactatgg caggcattat 1500

tgaagacaga cactatggta ggtccccagg gtaaataaag gcagagaaac aatcagttca 1560

ctcagagtcc tgggagagag tcattgttct tcagtgtctc aggtacaatt atgaaatgca 1620

atggtcttgt ctatctttgt gagtccttta cttacgtgtg cctgtgtgct tgtgtgtgtg 1680

gatatgtgcg ggtgcatcca tgtgcatgtg tgtccctatg catgtgtgtg tgcatgtgtg 1740

tctgtgtgtg tgtgtgtgtc cgtgtgtgtg cctgtgagca cacataggtg tgtgaatgca 1800

ggcacccatg tgcaaataga agttagaagt taattcttag gagtagaatg ttccttttac 1860

catgagttcc aggggctgaa cttggcttgt cttgcacaaa aaacactttt agtctctgag 1920

ccatcttgtt gactaaaaga tgtgggacag aagggcccca aactcctatt ttctgcttct 1980

atgccctatc aacacgtgcc accatgaatc ttttttagat taaggacctc actagattcc 2040

tcactcagaa agtacttatt aggaacttcc ttgttcagtt gctctgtaga cagaagtatt 2100

tcactgtttt cattgacaaa atgggtcaac tgaaatggga aaaaaagcaa agaaaattta 2160

ctagacaact gcagctctca gaaagacata gagtctccag catagtgact atacaattgc 2220

tgtgactacc attagtaatc agccaagaaa gcttttttca atatgaagat taaaaggtgg 2280

gtaagagaga ttattcatta tagttcatga attaacaatt cacacttcag agccatgtat 2340

tcaatgaaca caacttgcaa gtagcaatgg catgttctgg gctcagcact aagatactac 2400

aggttctctc ctagatccat gtaactcttc taaattcctt tgacagtttt gagacacctt 2460

caactgacca tgtatgaatc tacccatgat attctgcatc cttttaactc aaaggtttac 2520

ctttctttca ttaaaccaac tgtccccccc cccc 2554


<210> 72 

<211> 3154 

<212> DNA 

<213> Mus musculus








<400> 72 
ttgtctcagc ggaatggact tctcttaata taaaggtttc cagaatactc tgtcagaatg 60

aaggacacat ttccaataga gctgagtatt actcagtttt aactaacagc cctagacgtg 120

ggatagcagc tgctcggatg cgcagaagct gaataggatg aataagagaa accgctttgt 180

aatatattct cctccaggtt cttcctgcaa ggatctggga tcaatcctct tgtttctgta 240

acctcccgcc tcatcaccga ggagacctgc acagattcac tgtcttgtgg aatctccatg 300

gcgacatgca cttcagttct ttgaaaacat gagttatagt ggtatttggg gcttgcttat 360

tcgtttgttt gtttatttat ttatttattt atttatttat ttatttattt atttatttat 420

ttatttatgg tactccaggg agatgaagca attggaaagg attcattcct taaccagctt 480

ccattgccag agataaacct caagagtttt aacccagcag gttaccctca tcttgaaaag 540

gtcttggcca atgcatatcc tgattttatg gttaagacac acagtccttg aaggtttgta 600

aacaggagca caatacactg gaatcctgca tctctgagta taagtctggg gtatgtgcgc 660
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ttgcatattc ttatcccttg tgaaagagga aataattaga aaatgagagt aaagagaaaa 720 

caaaactttc tttaaaaatg cctgtattgc tctttgggaa attatttagg ttgtctccaa 780 

catgaaacat ccatgtgttt aatatatgac cttgaaaggc atgaaggaga gaataaatgt 840 

aaacttcaaa tttctgccat cagaaaccaa aggctttggg gtatttactt ctagccacaa 900 

tttatctgca tacttttaaa atacacttat gctgtggtct tcaatattgg tggttttttt 960 

aatttattaa atcaaaatca tttttaaatg cttctagatt gtcctatcaa aactggggag 1020

gggcagggca ctcacaatga cagatcgggt aagctacaaa atcacttggc aatacagaaa 1080 

ggctgacact tcagggctga gcagagccac aaatggaatg cgtctattgg tacttcttcc 1140 

ctttttaggg tcatttaaat tctgcaggta taaatggctt atgcaggtag tttgttctga 1200 

gcagaggcta agagagacac ccaagaatca tgtatcatga attgtctaag tcagaatgga 1260 

aggactagaa agcagcaacg tgttgggaca aacagaaaaa tacttgactg ggtatagtga 1320 

tggtgtgccc ttcagtggta ctttctttgg acacctccga agggtttcaa acaggaagga 1380 

cccaattctc cggaagacgt gttttgcctg gggatgttta tctcagaacc tgagttctgt 1440 

acagttctgt atctccgtta cctgagccag gcacaaactt gggtgaaatc ctgtagtgat 1500 

cttgaggtgg gtgcctctgc tttgttctat ggccgaagag agatagtaag accatagcaa 1560 

gccaagtttt atttggaaga tcggtagtga aaacagtttg taatctgaac tgatgaagca 1620 

ttatggctgc acaaataact aaaaatatag ggcctagaaa atcagtacca tggtctccag 1680 

ctagggttac tcacatgaca ttatttaatg atgagaagct gtgtattcca caggcattgg 1740 

taggcaatat aattagaaga atgctttgcc agcgatacaa ctaggagtga actaatttat 1800 

gatttatttt taagtgacta ggacccttaa tttatctacc aggacatcac tgaggtaaaa 1860 

ataaaaataa taataataat aaattttaaa ataagcattg caaatgcagt ggcattgaag 1920 

tactaaacag gaacttaaaa agcaacagaa acaattatgg ctaatgaagt agattaattg 1980 

actaattggt gtttccatgt gtaattgttc ctgcaattat gattcacttg ccgctttcta 2040 

tactgctatt ctctaaacat atcctgacat ggagtggagg tagaagacat tggcagtttg 2100 

ggagatctat tgagcaaagg atttttacct taaagtgaga aggcaggaca tcaaagcaca 2160 

cctgacgaga gctgaggaaa ctgtactcag catcatctcc tcgactttga ccaacctgat 2220 

ggactttcat cttctaatgg gcacttatct aatctataaa atgacagggt atggtttcag 2280 
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ccttaaaaca ggattagcca gtccaaatca caactgcccg tgcctttaac gttttaagga 2340 



gaaatgatgt attgatcatg cacgagattc tattttcatt ccagacaaat tcaccttcca 2400 

tgcaaataga cgcctggcgg ctgttccagg gtgctgtgta ggtaaataac tgtgggggac 2460 

actgtttccc taagaattac aacaacattt gaaaacaaaa acatgttgct ttccccagga 2520 

atgtaaatct cagactgtgc tcaagtctga agagaaatat gatgggccaa ggaaaaataa 2580 

actctaaaaa taagattcaa acctaaacga tgtgcacagt aactagactt actcaaaaca 2640 

cactctagaa atacattcct ttttattttt taactgaaaa taattttttg tacagtatat 2700 

tttgatcatc atttttcctc cctcatctcc tccaagatcc tccccctacc tcctaacctg 2760 

tccaaatcca cacctttccc tctgtcttta ttgatggtct gtttgatttc atgatgccaa 2820 

agatttaatt gtgtgcttgt gtgtgtgtgt gtgtgtgtgt gtgtgtgtgt gtgtgtgtgt 2880 

gtgtgtgtgt gtgtgtggtg agactatgcc acatgtgccc agatgcccat ggaggttgga 2940 

agaagacatc ttcctgtaac tggagttgtg ggtgccagca acttaagtcc cttggaagaa 3000 

tataccatgg gcccttaact ctaaaatgta atgttttagt tatttgagaa gttcatgcaa 3060 

tgtattttga ccacattcag gccccactcc tttcctaact cttctcagac caattctcaa 3120 

ctgcacgtta cttccaatat catgttctct tttt 3154 

<210> 73 

<211> 1616 

<212> DNA 

<213> Mus musculus 

<400> 73 

tttctagtaa taatagttca tcctcaatca aatgttgtcc tagacttcct ggaagtcatt 60 

ggtatttgtg caaacaaaac cagagagcca cactgtgcta ggtggccctg gaggatatga 120 

agacatacga tggccctggc accactcttt ctgccaccca aggtgctctt tcttgtcatt 180 

aacagaatct aaacaggcaa ggagaactac tggaaaattt accatagaca ggtgaaggtt 240 

ttcatacaaa tctgttatgt agcagatggt ttatgtctaa ataagttctt ctaatgtatt 300

tgcatttttg aaaataggct acaaagtaac ttaatgtgat aactggattt tttataaagc 360 

aaaataaatc tatgttggaa atacacaaag acaattatgt atgttactag caagtcttat 420 

tttgtacaaa ttacagaata cccattttta aaaaaacata cacatgacct ttgatggtac 480

atttctatgc tggcaaatat atttctgata caaaggcact ttgaaaaccc aatatggttt 540

ggtttgtaag tttgcaaggt acaaactatt tctacaaaca tatgattaca ctgaagaacg 600

tcaacaactg tgacattaca tagaaaacag acttcttttt gtaaaaacaa cactggtttt 660

atgggaataa aagcagtggt ttccagcagc tttagtgaag acattcagaa ggttgaaggc 720

ctccagtctg aatttctcac tggctgggtg ggatgataag gatggggtga gtgtgatcag 780

caagaaccct cagcggggaa agataaccca ataccctaga aggaacccag aaccctcagc 840

agtcagaatg gcttgaaggg aaagcacaca ggcagcagtc tgggcaggat ccggctgtga 900

caccgctcga acacgcggct gactctacct gcgtccttgg tgcttctgag ttgtagtggt 960

cgtgtctctg tttcagagca ctgggtagta caatgctttg gactgcacag tactgggaaa 1020

gtttggtcct caatcagttt ccttcctttg actacattgc cgcaaagtca atctgtacat 1080

aaagctgtct cactccagcc tggaaagtta gaaagaaaaa ttacaacatc aaacctgaaa 1140

acctgaaaca gatgaacatt ctagaaaagt ggtaggaaag caaacaacac ctcatttgtc 1200

tttaaaaatg gaattatttt tgtcttttgc ttctaagaaa ctatccattt tcattgtaag 1260

tcgtcaaact actacaggaa aaaaaaatgt gtgtcattag ccacatggcc tgcgtatgcg 1320

cccttagatg ggtttattta tgtgccgcct ctgttagagg acacgacagt gctggattag 1380

gcagccaccc attcaaacca gctcaccacg ggctatatct tcccatggct tttaaaaatc 1440

acaagtggca actgcttact tattttacac tccgactcct tcagtcttgg ccatgagcta 1500

gctactcaag ggcactactg tctaaattgc accctatttc ttctgtatta tcaattttat 1560

tctctcaact aggtcatttt tattagtgat caaacacact ataatttctt tcactg 1616



<210> 74 

<211> 447 

<212> DNA 

<213> Mus musculus 

<400> 74 



taatgtggca caacgtgggg ctgaccctgc tggtgttcgt ggccacgctg ctgatcgtcc 60

tgttgctgat ggtgtgcggt tggtattttg tatggcatct atttttatct aaattcaagt 120

ttcttcggga gcttgtggga gacacaggat cccaggaagg agataatgag cagccttcag 180

ggtctgaaac agaagaagac ccttcggctt caccacagaa gatcagatct gctcgccaga 240

gaaggccacc tgttgacgcc ggccactgag cagacaaagc agtgtcttag agtgtgggcc 300

aaggcagtca cgagcctctg tccttagtgg cgacctagct ttgaaagtta ctaagtgacc 360

gaggaacatt tgcaattgga tttatatcca gttttaaaaa aaaaagattt acacgtaagc 420

catatagaaa taaagggaat ttaaacc 447


<210> 75

<211> 2706

<212> DNA

<213> Mus musculus








<400> 75 

ggactgcctg caaagactga catttattta 


ctttcatggg 




60 


tc'bc't'bc'b g'tta.ca.tg'cc tgg'ctgg'g's.g 






120 








180 


t3.'b'b3.9't'tC3, Ct^QQB.QB.B.B.C QB.D.^^CD.B.B.C 


tttttaagag 


tagacctttt taatatccat 


240 


3,B.^gC3,B.3.^C ^QBg3.B.^QC^ t a. 1 3.9^9^3^^3.0 




aatctgtctg aagtgttggc 




ccatttcctg tacttctata ctgtgtgtta 




tagtaggtac ttcacgaatg 


360 


ttttacctcc tgaggtcaca gtaatttgtc 




ttcttagtgg aaatgcctac 


420 


cacattttgt acctagctag ctacttccac 






480 


aaatacttaa ctcaggctcc aaatgttcca 




agcttcttgc acagctccct 


540 


gttttcccta agacccaatc ttcccaagat 




ggagaacaaa aaggcaggaa 


600 


ccctactttt gcctattaat gtggctaatg 


gagtttgtga 




660 


tccacacttt ggacaaggat aggtcaaagt 


ttagctccct 




720 


cattgtgggg ggtgcatttt gaaattttga 


cccatatcct 


ttggaggcct aggagagctc 


780 


actgtgtcca gagaatttta gttcccaagg 


cctataattg 


ttgaaaagct agggaaactg 


840 


gcactattgg ggtaggagct gataccctgc 


cttttcaggt 


tcccaggcag gcctctagta 


900 


tcctgtcctt gctcaggagt tatctttacc 


cctgaaatca 


aatactacac atagcaccct 


960 


gtaaaactgc ttccttgtaa gcttcaccga 


gcttgctaag 


taaacatccg cccatgcttc 


1020 


tggaacagaa ttcatttctg cttcctggtc 


atctcttgtt 


tttctgttct cagtgcccca 


1080 


ccgaatcatt ttaatggaag aaagacctag 


aattagtcag 


aggcacctaa gatgattgct 


1140 


ttaatcttag taaaattctg aggaggaaga 


aaagttttcc 


tacagagtgt caccatactt 


1200 
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tgcaagtcac 


cactgctgtc 


atggccattt ctctgtgaag actttagcac agtcttgctg 


1260 


tgttgcagac 


cacagtggcc 


tcactcctca agttctatgg ctatgtttgt ccctacttta 


1320 


ttcccatgaa 


tctcttgctt 


gtttctcatg atttacctct tgtcacttct acctaccatg 


1380 


acattcctat 


tgccatggca 


ctgagtagcc acatcattca ccaagttcat gagattttga 


1440 


gctcactcta 


tgctgatcct 


acttccttct agttccccct cctgaaagag aaggctcctt 


1500 


ggcagagctc tttgtttatt 


tgatcctggg ctcatgggta aatctggaat gtcctgacat 


1560 


gaacttaggc 


atgttttttg 


ttgttgtttt tgtttgtccc cctccccctg gagaaggaaa 


1620 


ctggaaaatg gagatctgaa 


taagcagagg accattctgg gggccttgcc ttaagtgaca 


1680 


ttaaagccat 


atgcctttgc 


tcatcagacc agtaggtgac ttctttattg gcatacgcta 


1740 


ccctgcagcc 


cagaaggtgt 


aaatttgaag cactttcccc tccctgggat cctggcacaa 


1800 


atgccctggg 


cttcttttaa 


aaccagggaa taccaggttg tcttccaggt ggtccttctc 


1860 


tctgatggct 


gaaatattag 


ccagggtggt tgtgtaaagg ctgtatagga tagttggtat 


1920 


agaagaggtg gttgcctgac 


ctcttgttca tctttctgaa agttgcaaga tgcattttat 


1980 


gaatcagctt 


attgcctcag 


gacccaaaca catctaaaac atctagctag aggctggtgt 


2040 


agaaaacatt 


tatatatata 


ataagttata tatataatag ttggaacatg agccttgttg 


2100 


aatgtgtctg gttatgtagg 


gttcattggg ttagtatctg agctattctt tgtagcattg 


2160 


tcaggaaagg ggatgttaga


agctgtgctt tgctgctaag agttgtcact accctaaagt 


2220 


aacttgtacc 


tggctctgaa 


gcatcagtgg aactttcctc ataagagttt ccaggtatgt 


2280 


ggatgtgtta aagtattgct 


caaggactgg gggttcatct ggttacccat cctagggaag 


2340 


aaacaggtat 


tggctatgtt 


aggtctaggc ttggccatca ttctgttgtc ttttttgttg 


2400 


tcagaaggat 


cttgggaaga 


aatgactcag caacaaatct tagatgtgtg gccttcctgc 


2460 


ctttggggaa 


gtaattgaac


atacctaact agattctaaa attttaaaca ctgaatgaat 


2520 


agacgtacat 


tactctggat 


ttcttcaggc tgctgtttgg tttattcttt aaaatacctg 


2580 


atctgataga 


aggatctcat 


ttgtggagtt agcaaaaaaa atgttttaat tatgcttttt 


2640 


agccttagga 


aattcaggat 


ttaaaagtcc tgatttcatg ccccaaaaca aaacaaacaa 


2700 


acaagc 






2706 
<210> 76

<211> 1902 

<212> DNA 

<213> Mus musculus

<400> 76 

gagacatccg tgggagcact gacagctttc acaccaaaca ctaaccagga gtcagctctg 60 

ggtagagctg cccacgaagg aagagtgcct aagtcagccg tgctcatctt gaccctcacg 120 

gcaggggaga cggaagcctg aagccccccc tccctgtaga ttcacttccc accagggaac 180 

ttaccaatat ttcctccaac ttcataaaca aggaaggtga ccggcttcgt gttactgtaa 240 

gacagagaaa aacgtccact ggactctgct acacaaggca aggccacccc aggtctggaa 300 

ggagcatcct cagggatttg tggtcctagt aggaagggcc tgaggactct tctgagagaa 360 

caggagccaa tgaaagagac cagcctccag aagcagagag gggaagggag aggcagggag 420 

gtagtcagtc cttgggaaag atcagcccca ggtgctctgc cacagggcat cagaagtggg 480 

cttcaggggg ttcggtcagg gtgacaagtg gagcgagcac cctgctgtga gcaccggctt 540 

ctgcatcaag gaggaatgtg tatgactctg tgtgggttat aacaatacaa gtgtataatt 600 

atccgatgat gattataggt atcctggatt tggatccagg atctacccct gactagcttg 660 

tgtctttgga catgcctttg gcctttgatt tctcattagt atcagaagga attgggtcag 720 

acacttggta agggctgata ctctgggggt ctttgtgcca accccaggaa ggaagctccc 780 

aaggactccg gctgctactc cttgtgcctg ggaatgacag ctacgtatca gccccctgat 840 

gcatttggcc acaccgtttc ttgcttcctt aggacctctg cacatgcagg gttctgtgcc 900 

cggaagcctt gtttgctcct caggttacag taggcacctc agtgcaattc atgttatata 960 

agacaatata gtgtttaact tgactcagct acttaattta tgtctgtctc ttttctcatg 1020 

gcatgagcct gtaaaggcca gaacctttgt tgttgttgtt gttgtttttg agacagggtt 1080 

tctctgtata gccctggctg tcctacagct cactttgtag accaggctgg ccttgaactc 1140

agaaatccgc ctgcctttgc ctcccaagtg ctgggattaa aggcatgcac caccaacccc 1200 

cagctggttt ttttttttta aatatttatt tctattcttt tgtttgtctg tttgtttgtt 1260 

tcttccttct tgtctgtctt ccttttgttt ctttcctttt ttttcatttt tcttctcttc 1320 

tcttttcttt ttcctttctt tcttttcttc cttccttttt ttcttccttc cttccttcct 1380 

tcctttcttt ctttctttct ttctttcttt ctttctttct ttctttcttt ctttctttcc 1440 
ttctttcttc


ccttctttct ttcttatttt ttttgacaca tagtttctct atttaacagc 


1500 


cctggctgtc caggaacttg ctttttagac caggctggcc tgaaactcag aaatctgcct 


1560 


gcctctgcct 


tcccaagtgc tgggaaggcg tgcgccacca cagctggcta tttatgttta 


1620 


gttacgagtt 


gggggtggtg tacatgcaca tgtgtgtggg tgtctgtgga ggccagattg 


1680 


ctgtgtaccc 


tggagctgga gtcacaggtg gtcatgggct gcctgactag ggcagctggg 


1740 


aaccgaactc 


caatcctctg caacagtggt cagtgctctt aaccactggg ctgtctctcc 


1800 


agcccctctg ttaaggcctt tcattgccat tttagcattc catgcaccgt ctgatctaca 


1860 


gtagctattt 


attaatatat attaaaatca gtaatgaatg gt 




<210> 77 






<211> 1086 






<212> DNA 






<213> Mus musculus




<400> 77 






ggacagtaga 


tttccgagag gaaaggagaa gagaagatag ggatggaagc agagaaagaa 


60 


aggtttaggg 


agagaactgc gaagggatgg cagggggtgg tcacaaaaag agaaagggag 


120 


aagagatcag 


gctaatctta cccaatcgat ccagcgatca ggtgatcgtt tatgcggtat 


180 


caggacacga 


aaatggctcg acaatgcaat ccacaacaaa gctatttctt tctctagttt 


240 


agcttggtaa 


ctgctggaat caaatgcaag gcagacatgg cacagacggc cactcgtgca 


300 


cccctcctac 


cccaccccca gtggaccagg gacagttctc ggactgccct aacatctagc 


360 


ttcttgtagc 


atggtcttta gcgcaggggc tcagtactga aagcaattgc cttaaacaca 


420 


ggctgcatga 


tttaacgagg aagtgggacc taggaaggct gctgccgcct aggtaaccca 


480 


tcatcagcct 


cttggtcatc ttgctaaccc tccaaagcat agcgccctcg aaaactctga 


540 


gtgactacga 


ccttcaacca ttcacctccc agcccctgtc tatgcaaaaa aaatcgctga 


600 


ggattagatc 


aatgcaatgg gaccggtcct gctggtttat ttgggctgtg tgttaagtcg 


660 


attttcttac 


tatgcctgga tcagtttctg caagggaaag gattccaagg aagtgctagt 


720 


gaaatgcact 


gatggaatac tattaatcct aatgataaca ttcccagtgc aggtaaactc 


780 


tgaaaagaac 


tgtgagtggg aagatcctgc aggatttccg aagcgctgct ctcctgaatt 


840 


ctcattggat 


catccccagc agcctaaatc agtgctgctg ctctctgaag atgactacag 


900 


cctagctcac 


tagttgtcta ctcctgttta caacacatcc tttcacttct ctttggagat 


960 
ctgtttgtct tattcatcct gttctcttgg gttacccagc ttcctcactt tgtgggtact 1020
tccagtttgg tacatttgtt tgcctgagga ttatcataag aaaggttccc actgttaacc 1080 
attctc 1086

<210> 78 

<211> 3701 

<212> DNA 

<213> Mus musculus 

<400> 78 



gttctcgtct 


gctgaggtat 


gactctgagg 


tatgactcca 


gccacttgtt 


ttcctattct 


60 


tgcaatagtg 


tgatgattta 


gtaaagcatc 


atagaagaaa 


atgtattcac 


tccctttaag 


120 


ttaggcaagt 


tgaaatttta 


ttaaattatc 


tatcttatta 


taaattacta 


aaatattctg 


180 


ctaaaatatt 


ggctttgaaa 


gatattactg 


taatacgttg 


ctcaaaactc 


caaggttatt 


240 


gggaaattag 


aatattctaa 


tgggagattg 


tgtcatactt 


tattcctttt 


atgtctgtat 


300 


cttactttct 


tctgctgttg 


gaataagatt 


ttttttcatt 


gccctgtcat 


ttcatatata 


360 


gattatcata 


aataaattgg 


accatttaat 


atttctttcc 


ctaagaaaag 


attgaaccac 


420 


ttaactcact 


ggtaatttgt 


tataataaga 


aatgaactca 


ttttaaattt 


attcttttta 


480 


aaatccaagc 


actctcttcg 


aaacaattta 


tctctggcta 


tcgttacaca 


ggacttttct 


540 


ccagccatgc 


tgtcgccatc 


atgacttcat 


catgacttca 


tcatgacttc 


atcatgactt 


600 


ctcttgcaaa 


agatctccat 


ggctaggctt 


attgcccgct 


tggctttccc 


tcatccaagg 


660 


aaaaggtgga 


agggttaaag 


cgcccttatc 


caaatatagg 


gccctctgcc 


cagccagctg 


720 


acggcaatct 


accctaattg 


cacgtggtct 


ctgggaggag 


ccagaagtga 


gtcagccggg 


780 


aagaatgagg 


gggaaagctc 


tgctctaccc 


cccccccgcc 


catccatgct 


ggtttcagac 


840 


cctctagtgt 


ggcttcttta 


tggtctccca 


agttcctccc 


cattatcact 


tactgattac 


900 


tctaaattgt 


ttagactcat 


tggttccccc 


agtgaactct 


ggtgttattg 


ttctgtggtt 


960 


tgtcatgggg 


acttaggagg 


aaggagctct 


cgcgcacaca 


cgtgcactcg 


cgcgcgcgca 


1020 


cacacacaca 


cacacacaca 


cacacacaca 


cacacacaca 


cttagagatg 


actctcacaa 


1080 


ctgctatgta 


gttcttagaa 


attcttatgt 


ctgtgtgaaa 


ggcttcttct 


aacagacatc 


1140 


attgctgatg 


ctttacagtt 


tcacgtggcc 


cttgcctttt 


taatggcttc 


tctttactca 


1200 
tctgcctcac


agtgattgat 


gtcattaaat 


taccacgcag 


agtaatcata 


aatattgtat 


1260 


tatgatatgt 


atttgaaaat 


tgagtcatgt 


cagagaggag 


aggaactatt 


gcacaatttc 


1320 


ttcatattta 


ctattttaca 


taaatatcac 


ttaatggaaa 


aatagtgtca 


agatgtctgt 


1380 


actaaattgc 


cgtggtgaag 


ctgagcttag 


gaaagaaagt 


gcattataaa 


tattatcaaa 


1440 


acaatgtatt 


atcagacttt 


actttacaca 


ggggaatcta 


ttttgggagt 


agaaagcatt 


1500 


acctctaagc 


cttgttataa 


ccaagtttcc 


cgttatcatt 


agagctccta 


aaatacactt 


1560 


gcagtcttga 


ctataaaatg 


tcatttgtta 


caaaccctta 


atcaattatt 


ttgctttaag 


1620 


ggttttttat 


ttttgttggt 


ggtgatggca 


gtggcttttg 


ttttgtttta 


ttttttcatt 


1680 


accagcaaat 


ggccagcatc 


tactggtaga 


aataaagaat 


gggcgcttga 


taactacacg 


1740 


aatatgcatt 


gtttcttttt 


gtttcttttt


tttaacagaa 


caaaaaaaga 


aatataagag 


1800 


tttctattcc 


tcttttgcac 


acacatatta 


atacacacac 


atgcacgtac 


tactagggat 


1860 


tgatcgcggg 


gcctccccca 


tgctaaattg 


ctttgcctat 


cttacctaag 


ctcagcctta 


1920 


tttttaagtt 


ttatttggag 


aagaagtgtc 


cttagctcac 


ctaggctgtc 


ttgactctct 


1980 


atggagcaca 


gacatgcctt 


gaagtttgat 


ccttctatct 


cagcctccag 


tccagatgta 


2040 


gatataggtg 


tgtgctacta 


tgccctgttg 


tctacaaagt 


catgcctttg 


tgacattaag 


2100 


taatgttaat 


gaattggaat 


tagcttcatt 


caggttcctg 


gggcgggggg 


aagtactcag 


2160 


tcccacaaac 


tttcctcacc 


tcagatgcca 


ggtgcatgtt 


tctagttctc 


aggttgccag 


2220 


cacttctctg 


tgccctctgt 


gccctctgtg 


ccctctgtgc 


cctctgtgcc 


ctggctacaa 


2280 


agctaaagca 


ttcctactct 


cctactacgg 


gggttgggca 


tttgtaggaa 


caactcttaa 


2340 


ggactccaaa 


aacacttact 


tgctgttatc 


aatgaatgtt 


attgctatta 


ttgattatca 


2400 


gaaagagcca 


aatgaatgat 


gcttaaagta 


aagtgttggg 


gaggagacct 


gacccattca 


2460 


gggctttgtt 


ccagttgcga 


agtatcctcc 


agggaagccc 


actcagatgc 


aggttcaaag 


2520 


caatagccag 


tctttattag 


tcagccagta 


gctgcagtgg 


gtgttcaaga 


ccctggtgca 


2580 


gtccccagcc 


tttctcgagg 


tgagctttta 


agcacagagt 


ccacaccctg 


gattggcaca 


2640 


ccacggttcg 


tgagaacagt 


tagcaggaag 


cagaactaca 


gaagccaaaa 


agcgaggtcc 


2700 


ctatatttag 


ggactttccc 


ggaattatgg 


actttgatgg 


atttgacctt 


ggttgtagat 


2760 


ttggccagtg 


ttactgtcac 


atgaagagtt 


taaagcccat 


acaatgcttc 


tatcctgctg 


2820 
tcagctatgc


taaggtctgg 


gggcctttaa 


caatgcctat 


tttggacaca ctcttgttcc 


2880 


cacacctcag 


tgtgtatata 


tcagtgtgtg 


tatatcagtg 


tgtatatatc agtgtgtata 


2940 


tatcagtgtg 


tatatatcag 


tgtgtgtata 


tcagtgtgta 


tatatcagtg tgtatatatc 


3000 


agtgtgtata 


tatcagtgtg 


tatatatcag 


tgtgtatata 


tcagtgtgta tatatcagtg 


3060 


tgtgtatatc 


agtgtgtata 


tatcagtgtg 


tgtatatcag 


cctaaaggct ctctgaacct 


3120 


tcttgcttag 


gaggggatgc 


atgtgttctc 


ctaactaata 


tatatattta gcacatgaaa 


3180 


tattggattt 


cactttgaca 


ttttcataca 


cacatattaa 


tatactttgg tcatatccac 


3240 


cctcatcacc 


ctacattctt 


cttcactccc 


atggcttcct 


tttccctttc caaagacttc 


3300 


ctcctctgct 


ttcttttcct 


atgaactttt 


aaatcaaata 


tgtgtatata ccatgtgtat 


3360 


atgtgtttgt 


gtgtatgcat 


gtacatgtgt 


gtgcgtgtac 


atgtgtgtgt gagtgtgttt 


3420 


gtgtacagga 


gtactggggt 


agaggaaaag 


gcagagtgag 


catgagtaca gcaggaaccc 


3480 


agggatacac 


tatagccctc 


tgcaggctat 


aatactcatg 


tggctctgtg cctatggtaa 


3540 


gctgagctga 


catagagctc 


tgcattcctt 


agcccaactt 


cagttttctc agtcagttgt 


3600 


tactctatag 


ttaactaaca 


agctaccatt 


gcaagaaagt 


gacaaacttc caccacatct 


3660 


ctccatcagt 


tttacaattt 


ccctacactt 


ataattgttc 




3701 



<210> 79 

<211> 2346 

<212> DNA 

<213> Mus musculus

<220> 

<221> modified_base 
<222> (2281) . . (2281) 
<223> a, c, t, g, unknown or other 

<400> 79 

gataggaaaa ttttggaggg taaacgagga aagacaataa catttgaaat gtaaataaag 60 

aaaatatcta ataaaaaatt aactaaaaat ataggtaata aatatacctg ataaatattg 120

gtaactaatg ctaacctagc ttttcttcaa tattcaagga gaataatgat ttgtaaataa 180 

cgtagaatta ctaaaaagga aatgtaccat atataaagtg ctcaaaaatg cctacaatta 240 

tagaaagatc aagaatatta gactaaaata cataacaaaa ggcagaacac agagttattg 300 

agagaaagtg tataaatgag caaaatgtaa ggaacacacc tttatctttc tttgaatgtg 360 
ggacatcttt


ggagtcttct 


aaaaagaaca 


aaaatgaaga 


gtcagtagaa 


gaaaacatgt 


420 


cagaaaaaaa 


aaagaaaaga 


aaacatgtca 


gaatgaatat 


tatataaaca 


atgctgaata 


480 


agcaagtact 


tagagaaaag 


gaagcctttt 


tacttttttg 


gatggaaatt 


aaattggtta 


540 


ataactcacc 


agagaaaagt 


tatggagttt 


ttcaaaatca 


tgaagaattt 


gtctgtcatg 


600 


caatattcag 


cgggttcact 


taggagtaca 


tgttatcatg 


tgcttaaact 


tttctaagac 


660 


agcaaatggt 


aagattgttc 


agttatttaa 


gtgtgtgtaa 


gtacgaaggc 


ttgtgtctag 


720 


ctggcaagaa 


ttaaggtagc 


atgacatggt 


tcacaactat 


aatcctaaca 


ctataaatga 


780 


ggaaccaggg 


agattcctgg 


agctggctga 


cagctagtct 


tcccaaatca 


atgaacacct 


840 


gctcaaaaaa 


tcacggtgga 


gagctatgca 


ggaaaacact 


gatatagaca 


tggaggttca 


900 


acaagcttgt 


gcacacacct 


gcaagaacct 


acacatgtgc 


acacgaaagg 


caaacctgaa 


960 


gccatgacac 


acacacacac 


acacacacac 


acacacacac 


acaccacaca 


cgattctgta 


1020 


agacaactaa 


ttataaagtt 


cttaatataa 


aggtttgcat 


ttaataaata 


ttttcttcaa 


1080 


ggaaactaag 


aggaaaagaa 


gacatttaat 


gtaaagtgga 


aaatcagaga 


cactatttat 


1140 


gtaacaatat 


tactataaat 


aagttatatt 


agttaaagtc 


aattggtctt 


ttatacattc 


1200 


tgacttgcag 


gcacatatgg 


actcctgtgg 


aagtaatttt 


aaatcagaga 


taacactaac 


1260 


aatatacttg 


agccatcttt 


tattttcact 


gcaaactttt 


agtactgttt 


tatttactaa 


1320 


gagtttggat 


accatatttt 


aactgataag 


ttgaaaaggg 


ggaagatagc 


tcagtacgaa 


1380 


gaattttttc 


tgatacaaag 


aaatgtgatc 


ttcaaataga 


agtattagtg 


tgggtggtgt 


1440 


gtggtcccat 


gtgtgtgtgt 


gtgtttgtga 


atggtgtatg 


tgtacactgt 


gtatttatat 


1500 


gtatgtgtgt 


gtgatatgta 


tgtatggctt 


atgtatatgt 


gtatatctat 


tctgtgtgtg 


1560 


gtgtgtatat 


gttgttaata 


tgtatatgta 


ttatgtattt 


atgtatgtgt 


atgtatgtat 


1620 


gcatatgtat 


ggttatatgt 


gtatatatat 


atatgtgtgt 


gtccaggtcc 


tattctgcct 


1680 


tttttaattc 


tctgggtttc 


tccctttatg 


agaggctttt 


agctatagtt 


aacttttttg 


1740 


aaagcagttc 


ctgataaaat 


aagagatttt 


cacttcttgt 


tcaagttatc 


ctcatacatg 


1800 


ataatgttaa 


aataatgttt 


ctgaatattt 


gccaataaat 


gtatctaata 


actactttta 


1860 


aataatgcca 


tcctagtaat 


ttccttatat 


acaaagatat 


gtatagttta 


gcaattatga 


1920 


aagtgtcaaa 


gtatttctat 


cccataaatg 


ttacttagaa 


atgtcaataa 


ctataccaag 


1980 
agatcaaaaa cattagacta aaacaaacca


accacataat 


acaaaatatt 


taatttttaa 


2040 


gagaaagtgt gtaataacca acatttaaca 


aactcccctt 


tatctttctt 


tgaagttggt 


2100 


gcttctttgg aatcttctag agaaaaataa 


aaaataaatc 


acagtctaag 


ttctttttgt 


2160 


ttttttctaa aagaatattt tttactttac 


ttaaagggtt 


tttttttggg gggggaagcc 


2220 


agttagaggt tctctctaat cctaaataaa 


gtgttcctct 


tttttatata 


aaaatttaat 


2280 


nttttatatt taaactatag attatacatg 


attttgctaa ggagactgcc 


ccaaacaata 


2340 


tcttcc 








2346 



<210> 80 

<211> 3320 

<212> DNA 

<213> Mus musculus

<400> 80 



gggagacacc 


ggctttagcc 


gggacaagca 


gggtgggtcc tctgtggcag 


gaagggtggc 




tgggcttgct 


tcaggttcaa 


cgtagagatc 


gcagagagca tcgccactgt 


cgctatccct 


120 


ggggatgacc 


accttcttgc 


tgctgctgct 


gttgctgatg ctctcttctt 


ctgggccagg 


180 


ggtggtggag 


tcccagtagc 


cctcatcgct 


gttaggaaca ccttcttgat 


gctccttctc 


240 


cggatgtttg 


ggctcctcct 


tctgctggga 


ctcaatggga attctgtgga gcctcctgcg 


300 


cttgaccgag 


gacatgtcct 


tggctgcttc 


cccacaccgg gtatcactgg aggtctcagg 


360 


ggccaacttg 


atgtcggaag 


cagtggctgc 


cttcgctgcg ccttcttgag 


ttccttgtcc 


420 


ttggtcctca 


gtctgggaca 


acatgtccca 


gaattccggg agataggtgt 


cgtccacctg 


480 


atccgggctg 


gccatctcct 


cccctcctcc 


ttggtaggcc accacgctgg 


cgttcttttt 


540 


agagagaact 


ggcttgcctg 


gcccggggac 


atgcttgtca cagctggggc 


ctgcctcttc 


600 


ttctgggtct 


gcaataatat 


ctccacagcc 


tgtaagagag tcaaagcttt 


tcagtgaagt 


660 


cacgtcagaa 


aacatcaaac 


aaatacgatc 


tgccgacggg tctgagggtg gatcaacaga 


720 


ggaagggtca 


gggacagcag 


acgcccggcc 


gctgccacct tcggagtcga 


caagggggac 


780 


ggtctttatc 


ggaacgtcac 


ctgtcttgga 


cgcatcctcg gccgcctgga 


cctcaccggc 


840 


tcccgactcg 


agggctgccc 


cgggcttctc 


ctcgcgccga tgcccctcgg 


cgtcctctcc 


900 


ccgggagctt 


tgatcgccag agccggtggg 


gctcccggcc tcgaggcaga 


tccgctccgg 


960 


ggcgctctcg 


gcggacgcgg gcgcctgctc 


tccccctgcg ggctcacctg 


ctgcgtgtcg 


1020 
cgaggcgtcc


tggcccgggc 


tgtccgggcg 


gcgcgcggct 


cggggcggct 


cctccttgac 


1080 


gcactccagg 


ctggcggtga 


gcgaaccggg 


cagaaccagg 


ttgcccgggc 


ccgccgcgcg 


1140 


caccgccttc 


tcctcctcct 


gcttgccgcg


cttgtccctc 


cggtgccagc 


gcatgctgct 


1200 


gaagatccct 


ttcagccccc 


tcttttgttt 


gccgccagcc 


ttgctcgcct 


cggcatggtc 


1260 


tcccttgccg 


gtctccgatc 


gcccgttctt 


tttcagcagg 


gagaagaagc 


tgtgggactt 


1320 


ggccaccgag 


ctgctggcca 


gggagcccag 


gccaggcccg 


gagggtttgg 


ggggtccggg 


1380 


gatcggccgg 


gccccgctgt 


gatcgctccc 


gccaggcggc 


tcctccttct 


tgctgccctc 


1440 


cagcaccaac 


acctcggcta 


gtccgtcgtg 


ggtcctgctt 


ctcaccattc 


ccgtcggccc 


1500 


cgagctcttc 


ccatcccctt 


tgtttttgac 


cccaaaaatg 


ctgggcatgg 


tgccccccga 


1560 


tttccttttc 


ttgaataact 


tgaacgcagc 


cttattgatc 


ttccccgacg 


gcggctctgc 


1620 


ggcgggcgtc 


tccgcggcgc 


actcacaatg 


cgagtccatg 


tctgcagcga 


gggccccggc 


1680 


ctgctcctgc 


ctcccgcaga 


cccccgcgcg 


cgcgccgccg 


ccgcgctcgc 


tgacagccgc 


1740 


gccgccgccg 


cggctcctgc 


ccgtctccat 


ggaaaccgag 


tgggatcagc 


cgctgtcgtc 


1800 


agccgtaggc 


tcgcgccagg 


ccactgtcac 


ccgcgttcta 


aatcaacctg 


agcggctctc 


1860 


gctgctgcgc 


tgtggctact 


gcaactgcgg 


cggctgaagc 


gcacaaaccg 


gactatctcc 


1920 


cctcctgcca 


agggcttata 


taatatcctg 


ggtgccacat 


aggtttttgt 


gtagcaagag 


1980 


ccttcatcgc 


agcctcgagg 


cgaaagaaaa 


aaatgaaatt 


cataattatg 


ggctttggat 


2040 


cccagtgcgt 


tgattctcac 


cacctcaggt 


tcttcttcct 


ctagcccagc 


ttttcctagg 


2100 


atcgcagctt 


caaaaattaa 


cacatgcgtc 


ccccatgtcc 


acaaagaaca 


atgccttttc 


2160 


ctttatcatc 


tgttttggac 


ttcatgcagc 


ctcctatctt 


tctccggtcc 


ttcacatagg 


2220 


tcttagcgtg 


tgggtactaa 


gcccaagagg 


caccatgcca 


atctgttcac 


gcctgcgtct 


2280 


ggtccagtgc 


cgaatgtagt 


cctcttcagg 


ttgatgctgt 


attctgatca 


tgtggctctg 


2340 


atttcctgtt 


acagaagcaa 


cagccgtcag 


tcaaatacca 


attagcagct 


ctcagcaacg 


2400 


aaagtgcaac 


agtcccctag 


ggttcgctgg 


tcctttggga 


gaccacaaag 


cttttatccc 


2460 


ccaccccagt 


gacatggggc 


ttaagtcaca 


gaagccacca 


attggtcaca 


atagactgag 


2520 


cacactgaga 


acttaggttg 


gagaattctg 


atccctctgc 


aggtgggtct 


tagaggcctg 


2580 


ggctcagtct 


tcctctttca 


ttcaggggta 


tggtaagtat 


acctctctct 


gttgtgccca 


2640 
gcctacccct


gtctaagaac acccaaggac cccactcgct gtctttccca ataggacagg 


2700 


ggggaaaacc 


agacaggggt cagaggtcgt ccctgtggtt gcttctgacc ccactggtct 


2760 


agaggcaaag 


ggaactgggt tccagagaac aacctgtgtg ggtaaggttt ctgtgttctg 


2820 


actttgaaga 


caactgctaa cattgtagct tttcctaacc aagggcttgc cagcaaaata 


2880 


gggaaggatt 


tggaaaagat tcatgtttta taatccttaa taaccctatt ttaaaaatat 


2940 


tgattatata 


tagtgattat ttgattgtat atactattta catacaaaca ccagtctttc 


3000 


ctgcttctct 


tgaatttatt ttccacaagt tacttatatt ttgcagattc agagttactt 


3060 


atatactgta 


cttatactga actgcagatt cagttgttgg ccaaggcagc ctccagggtt 


3120 


cacagttaga 


ccatggagtt gagcagtcat gttgctcttt ctcatctcaa tgaaaaaacg 


3180 


gctgtaaact 


gagactggaa tcaaaggacc tttctgtttt cagagttcaa accccaccag 


3240 


ctctttccac 


ctactaacag gactctttat ggacagaata tacaacaaca gacttttaaa 


3300 


aaaaaatgag 


ttttcttttc 


3320 



<210> 81 

<211> 1573 

<212> DNA

<213> Mus musculus

<400> 81 



gtaggagaga 


aaaagaaaca 


actatagtga 


agtgatcagt catttcattt 


tttttttttt 


60 


caagcaaggg 


ctgttttgaa 


agaactcaaa 


aaaaaataaa aataaaaaca 


atgtgaggtt 


120 


taatatattg 


ctagacttgt 


ccaacagaca 


gaaaattaga actctgagaa 


cttggaaaca 


180 


cacagtgtag 


ttctctgcct 


ttgggtttaa 


taacttaatt ataagggatt 


agtcctgatg 


240 


gcagcagtgg 


ttcaactttg 


ttcctctatt 


tggtcttatt gtttttttaa 


cttaaagaaa 


300 


ataacccggg 


aacttgacat 


catatgcata 


gactgtgttt cttaacacac 


aaatggacat 


360 


actatacttg 


cgtatatata 


tgggatagac 


aggttttgtc tccaacttca 


ttgtttatga 


420 


agtcatattt 


tgggttttac 


accatctccc 


ccccataagt tattacagaa 


aaatatcagt 


480 


ttagattgga 


caaagttctt 


ccccctctca 


ttaaaaaaga caaatgaagg 


agagggatac 


540 


attttaaata 


tttcctttct 


ccccctttgt 


taatatttca gtgcattttt 


tttaccttgt 


600 


ctacatggga 


aatggcttac 


tccccacaca 


atagaatatt ctggaaggtg 


taggtaagaa 


660 
aacaaaacag aaaaggcaat ggctatactg aaattaacac cattaaaact


gtgatcagtt 


720 


taaaaaatta aatagttatc agcacaaaag ggcgttaaaa gggaaaacac 


ttttttatta 


780 


accttaaatg tttcggattt tatttctttg ttaggtatca gataaatgtt 


atttcaaaca 


840 


attaaattct cactaccata caattatggt tcagcatcag attagcactg 


cactccgtag 


900 


ttaaggtttt aggaaaaatg ctttatccag tattgtcttt acaaacatct 


gtgattgttt 


960 


cattttcaat gtttttacaa gataaatggt gacttataat gggcatattt 


atttgcctgt 


1020 


atttcatttc ccccaatgaa tgtcacagga gatgccatag agctatttca 


ggttatatca 


1080 


cactgcttgt tcctgatgtg tggggactgg ccttcagtga agcaatccag 


agaagggcaa 


1140 


atagccaatg gtaaaaggag gaaatgaatg tgcagatacc aagtaggtaa 


ggtccgaagc 


1200 


tggggttctc tcttgctctc ttaggcttac aaagatggac attaccactg 


aaccttacca 


1260 


tatgtatata tgtttaatat ctgtcttttg aaatgcagaa atagtttaaa 


tgtttctttg 


1320 


tctatttttc tttttctttt ttttaaatgc tacccaggga aatattttca 


tatcgtttta 


1380 


cgtggcctgc ctcaatgtat atttatttct ttggagcaaa aaggttctga 


aaactggttt 


1440 


tctgtagctt taaatgagta ggtagcaaga tctatatggg atgtcatttt 


tttgttcagt 


1500 


ttctttttaa aaaatgcttt gttttgatac atttggttgt gcttgtgggg 


aaaataaaag 


1560 


cgcagagatc ctt 




1573 


<210> 82 

<211> 868 

<212> DNA 

<213> Mus musculus






<400> 82 

taatgtttct cactagcagt ctgggcatat gctggtgttt catctctgcc 


caaataattc 


60 


acctcctaac ctatgtgtgt gtgtgtgcac atggatgtgt gtgcctgagt 


gtgtgagtgt 


120 


gtgtgtgtgt gtgtgtgtgt gtgtgtgtgt gtgtgtttat gaacagtata 


ggttttaaaa 


180 


gaacagtatt ttacaaaagc catcactttt ataagagttc tgtaaaggaa 


ggatgtactt 


240 


cttcgctcac tatagtttaa aaaaaattct attttagagg aaaaaaaaag 


aaaatatgag 


300 


ggctctgagc atgactttta taactagttt cagttttatc taataactta 


cttttaaaaa 


360 


atcaatattt atcaataatt ttcatgtatg ctgtgctttt tgagagacat 


gtttggctgt 


420 


tcagtaaagc ataagagtat atggcctaag aatcaaaaag aaatgaaaaa 


ataaaataaa 


480 
aagagggaaa agaaaaatga


aaagggaaga 


aagaagatgg 


agcaaacaga 


aatttctatt 


540 


gtattttagg aaaatgcatc 


attttcaagt 


attttaaata 


ctgagttaat 


attgttattt 


600 


ttttttaatt ccagcagtaa 


caaaatggtt 


taacctaact 


ggcttgctaa 


gaaacgtgta 


660 


agtcttagat ttgaaagaat 


ttaacattca 


tagctctgga 


ttgaaatcaa 


aggtcatttt 


720 


ctgcaaaatt ccttccttag 


atcaggtcag 


tgctctgctc 


ccttcaaaat 


agaaggggca 


780 


agctagagtg gcagagcttg 


tggcttctat 


ctgacacttg 


ctttgggtaa 


acaggacctt 


840 


tccctcctgc ctgcatgtat 


agcacttg 








868 



<210> 83 

<211> 1888 

<212> DNA 

<213> Mus musculus 

<400> 83 



gagagggaac 


tttgtggagg 


ctctggcagg 


aacgtgtggg 


ttctgtatac 


ccaaggctgt 


60 


gtgcctaggt 


gtacctttcc 


ctgggacttt 


gcctctctgg 


atcaggtgtg 


tgtctgaacg 


120 


tacatcttcc 


tgtccctctg 


tctctcgctg 


gatctgcatc 


cttgcctctg 


caaacacacc 


180 


tctttgaagg 


tgtgtggctg 


cctgtctctc 


tgtgtcccta 


tctctgagtg 


tcttgagtct 


240 


catttctcta 


ccagtctctc 


tgtgtatcac 


tgtctctttt 


cacgactgca 


tgtccctttg 


300 


ccctcgggga 


caggtctcca 


tggggtcggt 


atcggtggtt 


ctgtgcgtct 


tgcatgtgtc 


360 


tctgcctttg 


tgtctatctc 


tcctccaagc 


tgtgtgtctc 


tctgtctctg 


tgtgggtcac 


420 


tgcctcttta 


tactcgtcca 


agatctctgt 


ttttcgaccc 


tctgcccgta 


ggggagcccc 


480 


ccttacctga 


gttgagcccg 


gtgcgcgccc 


gcgagagggt 


ccgggacaac 


gggatgctca 


540 


gcgccgactg 


aggctccaga 


gcctgcgggc 


ggggggcggc 


gagaggcggg 


caggggcgac 


600 


gtccactccc 


cgggccctcc 


ccctgggccg 


cgcagcctcc 


gccggccgct 


cccgccgggg 


660 


cgcgaccggg 


ccggcctctg 


ccgcgggcgg 


tctgcggaga 


gcgggcacct 


cctccccctg 


720 


gggcggcgcc 


tcctccgcgg 


ccggggccgc 


tagctcgctc 


gctcgctcgc 


tctctcgctc 


780 


ggcagcactc 


gggcggcggc 


ggcaggagcg 


tcgaggagtt 


cggctgggct 


cggctgcggt 


840 


tcggctgggc 


tcggctacgg 


cgggagggag 


tcggctaagc 


tcggctgagc 


tcggcgggca 


900 


gcaaaaaaca 


gcggagagcc 


ccgcctcgtg 


agtggccagc 


ggcctgcggc 


cctttggggg 


960 
aggagccatc


tctgtgccgc 


tgccgccgtt caattggctg tccgggccaa gttcccgcgc 


1020 


cattggttgt 


gctctggggt 


gggacacgac catcctggtg caggagagaa ggagggaggg 


1080 


gcttgggaag 


ggtgagagga 


ggggtatata ggggccactt tgttttgatc catcccttga


1140 


aaaggtgaag 


ggtcagccca 


cctttttgca gggatttctt gcatcttcca cacctggaat 


1200 


gcacccccac 


atgtgaagcg 


tccacccctc catatctgag agaagtctgt ttctattatc 


1260 


tgagaaagtc 


catgtgcctc 


cacacctgac cctcctcacc tatcatcttt ctacatatcc 


1320 


ctctcccagt 


agagtgtaga 


gtacctcatg tccactttac agcgcctgta gattgcccct 


1380 


agtcctaata 


acagtgtgat 


ctctgggaat tcatgacctt tgttgcgttt tactgatttc 


1440 


tcacccagtg 


ttggtcacag 


ggttgtgcat ggcttgctgg agggaacttg agacacctga 


1500 


agaatggccc 


agctctggct 


gaactagatc cagacggcat ggatggcctg actctccatc 


1560 


ctaacagttc 


tgagaccaag 


gtagcttgga caggagcagt caagtaggcc ttcccctctc 


1620 


agacccttaa 


gtagcagggc 


tcacacatgc aaggagctac atttctggcc ctttagtgat 


1680 


tcttaagtgt 


acttgtgtgt 


ggcttctatc acatattgtc catggacagt cagagaccca 


1740 


tggccccaca 


tccagcctta 


acttccctca gaatactata ctcacagcaa aatgcatact 


1800 


ggagtcacag 


tggtgttggt 


tcctggttcc actttctaac tctcttaaac gaagctttct 


1860 


tctcaataaa 


actgtcatag 






<210> 84 

<211> 3946 

<212> DNA 

<213> Mus musculus 






<400> 84 
aaatagcacc 


aggcctttgt 


gatgatgtat aagaatacat gcctatctgt tatttaagaa 


60 


ggaacaattc 


tatagctttc 


actaagtagc actgagtcta cttatctagc cttaaaaata


120 


tatcataaac 


tcggagacct 


gatgccttaa gaaaccattg actttgttat taataggaca 


180 


caatgtttcc 


ctgtttttaa 


catttatttg tttatttgtg tacctgtttg tatataatat 


240 


ttgtgtgagt 


ggtatggaac 


acatgtgcta tgacacacat tagcggtcag agaacacatg 


300 


taggagtgga 


ttctctcatt 


ttgatctttg attccaacaa ctgaacaatc ttatattagt 


360 


gtgattatta 


atgtgatgat 


aatattaata ctaataatag ctgttcatca ggcattaaac 


420 


tttttctact 


ctacacactg 


ttctcattgt tttcaatcct gtttattatt ccaacatctg 


480 
fcctatt ctac


agaggagaaa 


gctgcagctc 


agaagcatga 








tcctgtaata 


agaagctggc 


attcagtctc 


aggactagtt 






t- -h i- -h f- -1- -1- -1- s a 


a.gagttggta. 


cggcttttga 


agccaccatt 


gtctagcgca 


660 




ttattgtttt 


ttctttgaca 


tattgactaa 


tgcttctcag 


tgatgttatt 


720 


t tt t cbtc cb 


ttggtttgtc 


tttgaagagg 


aaatgctcaa 


aaacagaaag 


ggcagcgatc 






tgfcfcaa tcag 


attaatattc 


aaagttccaa 


gctgtacttg 


tatacacatt 


840 


ta.gtca.t9ca. 


attaataaat 


taatacttta 


ctgattttct 


tctctgtgct 


aggcactgat 


900 


a. a.a. a t a^a. 1 


tagaaatgct 


aacgtgatct 


agctcgggaa 


agaaactttc 


tgtaccccaa 


960 


tcaaactcfat 


caatctttaa 


aaacaaatac 


tttttcttat 


cagcaccatg 


attcatctta 


1020 


aaacagacfS^S 


tctgctgaag 


tttgtttgaa 


cattgttaca 


atagctttga 


acattatgat 




agccagtttt 


caaagtagtc 


tgccagcta^ 


tgtttaactt 


tacccagagg


gtctaatatc 






taaactaatt 


ttaaaa^aaa 


acagtgttca 


cctttaaaaa 


tgtatttaaa 






attaataaaa 


aatgtatgaa 


tgfccafcctca 


ttfcfcacfcgfca 


ttcttgaaag 




aaagtgagtg 




atatttagca 


gtgggaatta 


taaaatcata 


agacgattaa 








caatattgct 


ttttaaatga 


cttcaggata 


taggaggtac 




taa9'9^ttatt 


atctgttatc 


tggaatacag 


tggaattgta 


caactggata 


ataaccaaac 






aatgtaaaat 


tacattatca 


gtgtgttaaa 


acaaaattgc 


gccctaaagg 




agtcca.catg 


ccagctcaga 


gctgcagtgt 


aattafcgcfcfc 


aactctgata 


attaaaatac 




aacggagatc 


aacagaatca 


ttttcattaa 


gaaaattatt 


aatgatggtg 


gtatagcagt 


1620 


aattgatttc 


agagcaccac 


aggataacgt 


gacgctggaa 


acccaacttg 


tttaatgatc 


1680 


gtctgaggct 


ttttgtggtt 


atgttctgtt 


aaaatgggaa 


atacattttg 


aggcatgagc 


1740 


agtttattat 


gcctgctctg 


agaatcagtg 


attgggaaaa 


tggctgattt 


agaaaatcac 


1800 


agttttgatt 


caaagggtca 


ttgtggagag 


gtttgcaagg 


tcaagacaag 


cggccccacc 


1860 


aatgatggag 


ttattgttaa 


cagagttttc 


tattgcaaat 


agcagaatgg 


ccgcgtgttc 


1920


agcaaatatt 


ggactctctg 


ctgtttccat 


ttctttctgt 


gcattgccag 


gttttacaat 


1980 


tgtgctaatg 


gcttactagt 


attgtgttga 


cccaagtctg 


cttacacatt 


tacagttatt 


2040 


aatgtatgtc 


atgtatgcaa 


atctatactt 


gttttcatac 


agatgtcatg 


gtgggtttac 


2100 
t ag" fc aa a t aa


gccfcgtaata 


tatcattgtfc 


accttaafcga 


gggagatgtg 


ctaacatgtg 




gtagaaaaaa 


atattttaaa 


cccacttgga 


tccatatatg 


agtgaagctt 


gtgtctcttg 




a^cf 9^a 1 3.QSLSL 


actaagttag 


actgttggca 


aatgataaga 


actggggata 


tatcagtgtt 




3^ tCf t a 1 3.3. 


aaatgagaaa 


tgaaaagcag 


ggtctttgga 


gaggtgaatc 


actatcatgt 




g3ggatgtta 


gcttgtcagt 


caaacagatg 


ataatcttga 


aagctacccc 


cagaagtatc 




tttgtcagct 


aaactgaggg 




agagtaaaat 


tttttagaat 


aagacatcct 




a.a.a.ta9Cttt 


ttaaacatca 


tacagaatag 


atgaagttgt 


tgccttataa 


tgctagttga 




gaaaagcaga 


aaatatgctt 


ttgtatccac 


tataattaga 


gtaacattta 


aaagcatatg 




cacaagaaac 


agagggaaat 


aaaattgcat 


gaaattgcaa 


aagtatttct 


gtaaggagaa 




atttgtggcc 


attatttgtc 


tctacttatg 


ttacttattt 


gagagagaag 


ctaaactcca 




agaggcacta 


gcaaagtact 


tgtgcaaacc 


actggacttt 


gaggcttgta 


tcctctctta 




taaataaact 


tcttagagtt 


tttttcctca 


tagagtttaa 


gaaaagtact 


tacatattca 




aatcctatgt 


cttaatctgt 


tctcagactc 


attcattcat


tcactcattc 


actcattcat 




tgcattccac 


tgctggagat 


gaaacccttg 


tctccacgtc 


tacacatggg 


ctgggaacct 






tgcatgattt 




tttatgattc 


aaaccctaga 






tttaattcac 


aaaatcctta 


ttccacaaaa 


attgatgttt 


ttgcccctgc 


ttatataatt 




aaactttctt 


aatacacata 


tgaatgcctt 


aaaagtattt 


ttccttaaga 


tagtttccta 




aaaaaatata 


agtaatcaaa 


cacatgcata 


ttctccaaat 


tatggtattg 


gaatgaggat 




acaaattaat 


ctaaaacaag 


ctttggttta 


acaaagtcac 


tgaaaccttt 


attaatcaag 




tatacttaaa 


catttaaata 


atgacttcag 


tcagtagctc 


ataagcaagc 


cgagactcat 




gaagggtgtg 


aaagttggtt 


ttgttttagt 


ttgtttaaaa 


ctaagccatc 


agttctagga 


3360 


atgggtgagg 


gcatttaaag 


ggtttgaatc 


tgtgagccta 


gtggctcata 


gcgtgggatg 


3420 


ctcacagagg 


acagttttgt 


aaggctaggc 


aacttgcaaa 


tgagaaccgt 


gatgggcgaa 


3480 


ttagtttccc 


accattcact 


tgactgatgc 


cgagtctttg 


attatcacgt 


catggtagag 


3540 


ctgatcatcc 


agcaacatcc 


gtaatccaaa 


tgagagtttg 


tcagataact 


gtgtcaaaaa 


3600 


caaacttgta 


tccacattca 


tttgaaaact 


atatcccaat 


gggaaaatgg 


tgttgtagcc 


3660 


acacacttct 


agcaggtaat 


aaaccagcct 


ccttcctcta 


cttgttacct 


gaattcgttt 


3720 






WO 2005/005597



PCT/US2003/027106 



tccttattcg taggtcaact ggcttgtgtc acatggatcc atcattctaa acactccatc 3780 

cacctcttac gcttctcttt tcatggcctt tatcagcttg acaatactgc tggccatttg 3840 

ctgttctccc caacttactc ctggttggac ccagcagaca gatggttact atttgtgaaa 3900 

tgctcggaat agcaccgtct tagttccttc tctcattcct gtgacc 3946 

<210> 85 

<211> 2805 

<212> DNA 

<213> Mus musculus

<400> 85
taacatatag aagaagatcc caaggatctc aaagtccctc agatgtcacc ctcttggtgg 60 

cactgaaaag atgcattgca catcttcctg ggcacctttc agaaataata gttttgagac 120

taggtccacg tgatatggag agcattcagt gtgtgatggt gagactgtgc gcattcttcc 180

gtctcctgct gttacagaca gttaatgtaa aaatggcctt ttattggaaa cacaaatatg 240

tattcacttc aagaatagga gcagaggggg gacaaagtgt ctctcccgtt ctctgtttat 300

tgtgtgaaaa ctaaaacaaa caaacaaaag cggcaagtgc tgtaatttct caagtcaagc 360

catccttgtt gtgggcaggt tgtgcacttt ctttcttcaa ggagctgaag tatgctttgc 420

acacctttct cccagggagt ggcatttaac tgtggggctg tccaggggct agaaaatgca 480

gctggatgtg gaactgccag gctgggccag ggcaggcacc ttcagttcag ggagaggcga 540

attgggcaca ttaatggcac attttagagt cctggaaaat cattagcttt acactgtagc 600

ttctgtcttt ggccaacatg ggggagctaa attgtggcac ataacaaata tggaatgcag 660

attttagatt ttcatgctgt cttcctggtg gtgttgaata aaggacgtgg ggcaaatggc 720

ttttcatagc tgccacagag tcagaaagtt tggctttcct gtgggggaag atgtaagatt 780

gaagggcgaa ggagagccca tggaagatgt cactgggcac ttatctgatc tatcttggtt 840

gcatccttta tgacacaagg aaattatttg acaaaattca cttcaagacc cagtctatct 900

tatacacagt agtactagcc tctttcagta aaaagcactt agcatcaaac tggaccgagt 960

ttggcacatt gccacaattt atcgtgagat ttaatcacca agaatctttg gtaccttttt 1020

gctagttcag atttcatttg ggtcatgact atgtcagcac tgtttctatt tcaaagaaaa 1080

aaatacgttt tttacagtag ggcccatttt taatatttca aacaggtttt tattttatgt 1140
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gttaatttgg 


aggatgttaa 


taaagttcat 


aaaatatgtg 


cagactatat aatttaatgg 


1200 


aaaatagctg 


gtatttttta 


tttgtaatac 


aaatagagaa 


ctgaccactg tgccctcctc 


1260 


tctgtccctg 


gagtggccca 


ggtaaatcag 


tgcccttaag 


ctgcttttcc 


ctttcgctgc 


1320 


caagtctgca 


gacccggcca 


atatctgtaa 


atttaccttt 


gtaatttgcc 


atgtagtttt 


1380 


tgcaacaatg 


acctaattaa 


tttgagcacg 


aaccatatta 


ttgctactgg 


actcattttc 


1440 


tgtcacaaac 


ttaatttcca 


ggaaatgcaa 


cctgacgaat 


aagaattctt 


tagctctcca 


1500 


catgtgttct 


ctgagaccaa 


aggcaaattt 


aaaataataa 


taataaatac 


taaaatgcca 


1560 


gcattattaa 


aggcaatatg 


cttgatgcca 


aactcaattt 


gaagctagta 


aacatcaaac 


1620 


tgtatttcta 


atcagttttc 


gaatgtaacg 


tattccatat 


tgagtttgct 


ggattctgtc 


1680 


tctgcatttt 


actggccagc 


tgcactcccc 


tctgcatttt 


taaaacattt 


cagcaaagga 


1740 


ttttgctgtt 


cttagcaggg 


ttagtaactt 


ggggtctatt 


tctgagctca 


ttcgtcattc 


1800


tgagatggca 


ttgagttagg 


ctggcaaggg 


aaggatttga 


ggcatggggg 


ggaggggggt 


1860 


ggcgaggtca 


cttctgatcc 


cagcagggaa 


taggtgagct 


tcatttgcct 


ttacaatagg 


1920 


cgcacagtta 


ctgcaccttg 


gaggggctct 


caggtgccgc 


tcagatgggc gcatgtaaat 


1980 


gccctgtcag 


atgtggcgtt 


ggaatattaa 


tgcttcctcc 


accgctgcac 


cataataaag 


2040 


ctgtacacag 


cgagcttaat 


atgcagctag 


gctagggaat 


tgtataaact 


tagatagccc 


2100 


agtgtaagag 


acagcgatgg 


aagaatgcgc 


ggtatgctac 


agttcccagc 


ttgttctgct 


2160


ttgatctata 


gcaaaatgaa 


aacatcatct 


attcctcctc 


gcagagttgt 


ctcctcaact 


2220 


ctgccgcctc 


ccagtccacc 


tccagttagg 


attttctttc 


cttttccttt 


tccttttcct 


2280 


tttccttttc 


cttttccttt 


tccttttcct 


tttccttttc 


cttttccttt 


ccctttttct 


2340 


ttttcttctc 


tctctctctc 


tctctctctc


tctctctctc 


tctctctctc 


tctctctctc 


2400 


tttctctctc 


tctttctgct 


ggtatggaga 


aactcaacat 


aatttgtttc 


atatggttaa 


2460 


ttacaggcat 


ttgaatattt 


aaaatttaaa 


atagccatgg 


ctccagcttt 


ttctgttaaa 


2520 


aagtatgtga 


tttgaacaaa 


gggattagag 


tgtaaatatt 


tgtggtgatt 


tgcaacaccc 


2580 


ttcttctccc 


attaaacaca 


cacgcacaca 


tccacactgc 


tgatttaagg actcccatac 


2640 


ctttatttat 


acttcttttt 


aaaaacctac 


tctcaattgg 


agcgccatta aaatactttt 


2700 


ccattatatt 


tttgagtctg 


ttattgcttg 


aggttttata 


gagaattttt 


atcatgtgtt 


2760 
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catttaacag aatcaacagc tgtcagaacc agttgtgcta aatcc 2805 

<210> 86 

<211> 3481 

<212> DNA 

<213> Mus musculus 

<400> 86 



taacggaagc 


ttcccatgtg 


gcacttagca tcatatgatc 


ctcagctcag 


gtacataggg 


60 


tcatcacact 


cctcccctga 


agtcatcaca ctcctccccc 


gaaggggagc 


catagttatg 


120


aaggtcagtg 


attggcacac 


ccacctgatc aggaacctgc 


tgtgaattct 


ctaagactgc 


180 


atcagccttt 


gtgctctttg 


ctgaaaagca ggcttgggct 


ttctttgctc 


ttacctttaa 


240 


ttacaggaaa 


aatgtcagcc 


tggttgtgcc aatggttcct 


catctcagaa 


gtgagacttc 


300 


aggtgattcc 


cctaagcaat 


gtgt^attca aggcaattcc 


ccagagtcat 


aggaagacac 


360 


tctccattcc 


aagaacaata 


ctctctcctt ccctcccctc 


tacctccatt 


ctcagggtgc 


420 


cttagcgaat 


gtgaggaggt 


ctattgaaca tctaatcctc 


aaatcaaagc 


caagcagaag 


480 


gggaacttag 


ccttctgaag 


atgtattaca aagaagtgct 


agcttttccg 


ggctatttta


540

tgagggaaac 


aaattccagt 


cgbggataag ctagtcctaa 


atctaggaaa 


tcaatcattt


600

attcttttaa 


ggttccctaa 


tttcacctca tttagaggtc 


cttaaacctt 


tccaaagaca 


660 


ctgttacaat 


tccctcagag 


gcagtgagga atgagccagc 


ctgtgccgca 


gagagagatg 


720 


agcttccgaa 


gcagatgggc 


catggccatg ttctgagaac 


cccaccttta 


gtgcatcttc 


780 


agtgcagagt 


cgcacagtgt 


tgttgcctga gtagaactcg 


gccatgacat 


aacttttgac 


840 


caagctacca 


aaggggtctg 


gtctctgaaa acagcgtgct 


ctggacctgg 


cttgctggaa 


900 


gccagagaac 


agttctgtat 


tcgagtcctg tttcaatgtc 


tgctatcaag 


ataatgttaa 


960 


gtatctttct 


cctctgcagc 


ctcaattgct tagagtgagt 


attgtacaaa 


tgccaggttc 


1020 


tgccacagcc 


ccagttttcc 


catctgtaga ctgagggcac 


tttaatgtag 


cctatgtatt 


1080 


atgaggattc 


agtgcagtgg 


gggcagacat tttgccttgt 


aagtattaga 


tattcagggg 


1140 


ttggtggctt 


acttactggt 


agctcttctt tatttttttc 


cccttaactt 


tcttctcagt 


1200 


agattggtcc 


tcagctattg 


aagttataaa agttatcaaa 


taaagtccta 


gggactgacc 


1260 


ttcaatgaga 


cttgcagact 


ttctatatta ttacaggtct 


aatgggaaag 


gcgtggctcc 


1320 


tattgccact 


gatcaaatgc 


caaggaattt catgggttga 


agaaacattc 


atgatttaca 


1380 
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ctcttcttgc tctggaggag cagagccctc catcactcta gctggaggtg tgggcttgac 1440 

atttcaaatg gaagctctgt gaggagacac attgcctacc tgctcctacc tacacacaat 1500 

ctatagtttg agggggctgc acacatccct ctttatcctc aggaa^acct ctgtgtgaca 1560 

ttcaggccaa atctcaagat ttgaacttag tcacagttaa aatgttcctt ctcccagata 1620 

aggttgcacc ctgccagggt ccaggaatta ctatgtggat accgaggcat ctatccatca 1680 

tcctggtgag cccgcagctc taatgcttca gctgtcatta agattaccca gatgtcagag 1740 

caaggctact ttggcctagg ggagttttgt ggctctctcc aatcaagtct gccctcccgt 1800 

ccagagctgg cacaaaagtc cggaatggta atgccagaca gccatgaaga acaggatctc 1860 

aatgtcctgt gaaagtacag cagctgtctg agtcatccga ggaaggtttg attaggttcc 1920 

atctcattga aagtgtacat tgttccaagt atttgaaact gacaatccat gggctacatc 1980 

tagtctttat ttggctcagc tcaaaggacc tgctatgtag aagtccttca ttcctaatag 2040 

gaacatttga ttattacaaa ccccatccaa agtggggttg cacacaagtg gcatccatga 2100 

ctttattctg gagtgatgca tacagctgcc tcaagccctc tgctattttt acctctcacc 2160 

ctcctcaata aagtcacact tttcttttgt aagcatcaat taagtgccaa ttgtgtattt 2220 

ctcactggcc tgtgatacat taggattcgt gttataacag agggatgaac atgcttttgg 2280 

ggcctctttt gtgtcttctt gggaagagct ggatttcctg gtaagataaa aaaaatgttg 2340 

acaaagaatc atggtacaag agtttactct ggaaatgtgt ggtgacccat gcagaagggc 2400 

ccaggctgga gcacacagct gtgtggctga agaggaacta agggcactat aggtgaaagg 2460 

ctttagcctc atgttcaaag ggccttgaaa agtcagctaa aggaaagcag agccaggaca 2520

agattgttta acatgggcag attctatttt caggtataat ttggcagggt gcatggaggg 2580 

atgctcagga taggacaaaa acatctgctg gaaaaagggc caaccaaggt gatggtgtta 2640 

ggggctactg tggatctgta gtatgtgaga cctccaggga gatctttagg tcattgggag 2700 

gttatccttg aagacaatta tgaactggag tcccaatccc tccttgcatc cttgtttgtg 2760 

ttgtaagtcg gtcacttcac cttggatttc tactgggatg tgctggcatt cttgaggctg 2820 

gaagcagcga ggctgccaag tttgtaagta aaacctttag gaccatgaga taaataatcg 2880 

ttttttcttt ctataagtta gttgtgtcag ttcagaaaca gggaatggat gagtgttaag 2940 

cattaagttg aaaatcaagt gtatgtctcg tgcctgcctg cctgcctgcc tgcctgcctg 3000 
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tctgtctgtc 


tgtctgtctc 


tgtgtgtttc 


tctctctctc 


3060 




tctctctctc 


tctctctctc 


tctctctctc 


tctctctgtg 


tgtgtgtgtg 


3120 


tgtgtgtgtg 




atacctacat 


aaaaaaaagt 


aaggaaaagt 


gtgacacagg 


3180 


ataaaatcca 


tgagtgccac 


tgaagggact 


agaagagact 


aagtctgagg 


tattggacaa 


3240 


acagctccta 


aacaattttc 


cccctcttct 


ccattgcact 


tttcagacag 


tccacagcca 


3300 


gcttcccctc 


tgacccacca 


gcctgcacat 


tattcccgtt 


tcctaaggaa 


tgtgcttgga 


3360 


tggagatgga 


ccacccctcc 


acctctgggg 


aaaagcccca 


ttatcttctc 


aaactcttag 


3420 


aggcgtttta 


cagcttaaat 


ccactagatt 


ccctgctctg 


cccctgggtg 


ctatatccat 


3480 


t 












3481 



<210> 87 

<211> 1469 

<212> DNA 

<213> Mus musculus 

<400> 87 



actttgggaa 


gatatcagag 


ctttgtgatg 


atcagtatac 


tgggggaaat 


tgtctatgct 


60 


taaatttata 


gtgctatttg 


gtttaaccat 


ggtttcacca 


acaggtgtaa 


tacatttctt 


120 


tttaagtaca 


tgtataaaag 


tagactttta 


tgacgaaaac 


gacatgcagt 


gccaagtgat 


180 


tagcataggt 


gtttaagact 


attaaattat 


agcagaaaaa 


ggttatgaat 


gatgtcaatg 


240 


gtaatagcaa 


tcattgaaaa 


attgtctata 


aatattagat 


attaccaaaa 


tgtcaatacc 


300 


gtagtttttc 


aagagtttag 


aattaatctt 


agtcatataa 


atttgtacta 


ttgataatta 


360 


tttaataatt 


tggaggagtt 


tgaacagtta 


agctggttgt 


aaactgtgaa 


tttctaattg 


420 


taaattgtca 


tttctaattg 


aaattttttc 


ajagaacagat 


ttcagcttca 


catactaagt 


480 


actgacaaat 


aaaaaagtga 


gaaatttagt 


atttataaga 


aaataggctc 


taaaatgtct 


540 


tatgcttttc 


tgtttctgct 


tagagtataa 


tggaggaaga 


cagtgtcccc 


tgacctaagt 


600 


gtttctaata 


aagagtaagt 


gtagagacat 


cccacaagac 


agcagcagtg 


caaggctctt 


660 


aggaagtgtg 


tgcactgttg 


ctgtggtgat 


gtatccaagt 


caccaggagc 


cgtgtcctca 


720 


ggagctttcg 


gctcttgtcc 


agagcttgag 


acagttgaag 


ctgggttcca 


tcactgagac 


780 


cagatgtgcg 


taggggctgc 


tagtgtgtgt 


gtaaactgcc 


cactgctgtt 


tccatgcact 


840 
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ctcagcccct 


tcagctgatg 


cccctggtgc cccactcaaa gcacagacca ttaagcaata 


900 


aacaaaattg 


tcattcaaat 


tataattgtg gttattctcc tgtgataaga ggtgctaatt 


960 


ttgtgtgaga 


agaacatgtg 


acctctttgg aaatatagtt tggaagtaaa gtttttcccc 


1020 


ctcttcttgt 


acttccattt 


tgtcccaaat tcacagttca tgaaagaaaa gtaaaattgt 


1080 


ttaggagact 


tctaagaaga 


gctgcttgca ataaacattt cagattttcc cgagtgtgag 


1140 


ttgggaaggt 


ctccatttcc 


tccatactga caaagcctgt tgttccaacg tctcttagct 


1200 


tggagagtgt 


ctctacactg 


agtctcacct cagcctgcca gtgaccctga ggtcactgct 


1260 


gaggaagaca 


caggagtctg 


cactagtctt tacaagagcc gttgtcattt caggtccccg 


1320 


ggggttcatt 


acagtacatt 


tatacaactc acaagagtgt gggattccgc ccctttgagt 


1380 


atttttatag 


ttaggtggat 


aactgccgca gatgcatgga ttgcacattt tgactctata 


1440 


atacttagta 


tgaaaaatta 


atttatgcc 


1469 



<210> 88 

<211> 3497 

<212> DNA 

<213> Mus musculus

<400> 88 

aaatgaagga tgttttttaa tttaactttt tttttttttt aaagaaaaat atcagtaata 60

ggtcggcaac agcagccgta ggaagtacaa cttagggtag cattaaagca tactgtagtg 120

tggatatatt ttttttcttt tttaaaatgt gatattgaca ttttattaat attttttaaa 180 

ttgttacgtt tataaatttg gtacttaagg cacagccagt gtgaggcact gaacgcgaca 240 

tttattacaa tgagctgctg cactcctact tttataaatt ttactaacaa gtagactaat 300 

gtagacattc acagacggga tagcgcagaa gcatcttcac acgacaagtc tccaaaaaaa 360 

aaaagctaat tcagagaggg cgggaggaag cctgttcccg ttctaattaa gtgcaacctt 420 

gttttatctt attgttttat tttttaagaa aggaaatcat gttctttttt tatatctcta 480 

tatataagat acagagttgt ttggtaaggt ttggggtgtt ttttgttttg ttttgtttta 540 

agtttccaaa caaccaaact aagaaaaatg ctggatgctt ttagaataca aaacttactc 600 

aagtcataaa ataacaaaat agaactttta ctttaagaaa caacaacaaa gagagagaga 660 

gactcgattc ccagcaagtc agtgtctcac aagggcactg gctaattact cttctgtccg 720

cgttgctgat gtcctgtccc cagcccctgt ccatctcccc cacactccca tcacctcata 780 
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gaacgctctt 


tgttgatcat 


tgtctgttaa 


taatgtataa 


aatggctatc 


ttgtaagtgt 


840 


gctgtcttgg 


tactagtgta 


gtgacttttt 


ttctcctctt 


ctagtacata 


ttgataggta 


900 


taatgtaatt 


aaggttttaa 


aaaaaataga 


catagttatt 


cagattagga 


ccagtaagga 


960 


tagaactttc 


tcttatttat 


gaaaagaaaa 


aaaatgctaa 


taatttgggg 


gcagtttttt 


1020 


cctttttttt 


ttttttccaa 


gttcaaattt 


acttttattt 


ttgctgattt 


gatgtggttt 


1080 


caactaaccc 


aaggtctcac 


aatgttcaaa 


tggcggctga 


ctctacagca 


ttctgtaggt 


1140 


ccctgccccc 


ctagctgaag 


gggtgtgcca 


tactacctta 


aatgcaaaca 


ctagatatgc 


1200 


aaaactggat 


ttttttaatt 


tattttttaa 


aagagggagg 


tgtggtatat 


taaaatgatt 


1260 


ttactaagag 


aaaaaaaata 


tttttttaag 


gatgctcaga 


agaaattgat 


aatctgtgtg 


1320 


aatatgtttt 


agatgtttat 


atacattttg 


aagagaccca 


gtagcccata 


gcacaaatct 


1380 


tgtgaacatc 


tgatatgttt 


tcaagtggct 


acctaggata 


aggttcatca 


ttagtacccc 


1440 


cacccccacc 


ccaccgccat 


ctagaagtcc 


atcttaaaca 


attttttgta 


aattctttca 


1500 


gcactggtgt 


ccatctttgt 


ctttgtttca 


gttaagctca 


atagcgaatg 


tgggaccccc 


1560 


tcctctgacc 


ttccctgggg 


gagaaaccct 


cttggctaat 


ggctcttccc 


tggcattatc 


1620 


aataaccacc 


cggggactct 


ctggctttca 


gatccatctg 


cctgagaccc 


caaggtcctc 


1680 


tcctccctag 


aggggaggtg 


gagcagcagt 


tgactcctgg 


ttccctccct 


atttcagcct 


1740 


ggatgtgggg 


ctggtggaga 


tgcctccacc 


ccaggagcct 


ctgcatatgt 


ggttcgtggc 


1800 


cttcttctca 


cccattggga 


aaaccaaaca 


gcctcacact 


ctgtccccat 


cgctcattgg 


1860 


cttaactcaa 


gtgagacccc 


aactgggcct 


ttgcgttttg 


tttttgtttt 


tttttaaatc 


1920 


cttcatgacc 


attctcttaa 


tttgaactcg 


tagcttgggc 


tttaaggtag 


catggctcga 


1980 


ttgctgtcga 


cttaatgttc 


cactgcacag 


caattcacgg 


ccagtgaatg 


ttacacacat 


2040 


cttgctagac 


tagtataaaa 


atcattgggt 


aattgttggt 


tctaatgacc 


tgaaaggtgt 


2100 


tcagtttgtg 


ggtttttttt 


ggggggggct 


gcttgggggg 


ttgtttttcc 


ttctttgttt 


2160 


ttttttaatt 


tgagaattta 


gggggataat 


ttttgggggg 


gggttccaaa 


taaaaaaata 


2220 


gaaagctatt 


ttgatcttta 


gtgcaaacta 


gggacgtagc 


ctccatcacc 


acaaccacac 


2280 


cgcttcgctc 


tgccatccac 


ccaccgcagc 


agcatcttca 


agaataaagc 


aatatagttt 


2340 


actacatttt 


tttttaaaat 


tgaaggtcag 


ccatgctttc 


tgtattatat 


tgcatatgaa 


2400 
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attgtttaca 


aaagaaacac 


taactcatcc 


ttctctttat 


cagttgaaag 


tgacacacgg 


2460 


gacagaggga 


aattggaggg 


caggtgaggg 


aggggagagg 


gtttttttgt 


attttggttt 


2520 


gtttggtttt 


ggtttctttt 


cttgaatgtg 


ctttagtagc 


cagatcctcc 


aaggaaggca 


2580 


aaagccagcc 


agccagagag 


agagtagcga 


agcgggcagg 


ctgcaggtaa 


gcacactgaa 


2640 


caagagacac 


actggcaagc 


tcccccacct 


cagcgagagc 


agaaagcgaa 


cagctcagcc 


2700 


cagtcctggg 


cacagacact 


acacaaggca 


atgctggaat 


tgaaacaata 


ctttaatacc 


2760 


gtttaagttt 


gtttcctttt 


tttttttcct 


cttttttttt 


tctttcacca 


agaaaagaaa 


2820 


aaagtaaact 


aaaatacaaa 


atacaaaaac 


aaacaaacaa 


acaacaaccc 


atataaataa 


2880 


aacccacccc 


tctctgggaa 


acaaccataa 


ttcaaacatg 


gctatttagt 


aatcagaagg 


2940 


tattgtctca 


gacaggattt 


catttccggg 


aggcagggcg 


tgagggggga 


ggggggtggg 


3000 


actgagagac 


agttccagag 


cctccaggga 


aggcttctac 


tgctaactgc 


tgtattctgt 


3060 


atatactgtg 


ccaccctgtg 


tggagtctgt 


gagtgtgctc 


ttgagtagcg 


tgggctagcc 


3120 


aatctgccat 


tcatggtgtt 


ataaactcgg 


aattccatat 


gtaataggat 


gcaagtctaa 


3180 


gcgtttcatg 


tggacataaa 


tgtatctaaa 


taaaacttcc 


cctagcactg 


tggctgacct 


3240 


cacccttact 


tttatacttt 


agtatgaaac 


tgatgagaac 


tttggtagtg 


agtatttttt 


3300 


ttatatatat 


acatatatat 


gtatctatat 


atatatatat 


atctcaagca 


tctttcaggt 


3360 


ctttgtgtgt 


ggctttctta 


aagccctgtt 


gtaaaaaaaa 


aaaaaaatta 


ctaagtggat 


3420 


ggcagtctct 


cacatcacag 


atgtggaaag 


tataatttta 


tatttgtatt 


ttcaaataaa 


3480 


taagtttgtg 


aaaggct 










3497 


<210> 89 

<211> 4277 

<212> DNA 

<213> Mus musculus 












<400> 89 
tttcaacaaa 


cattttggct 


caggatacaa 


agtttaaact 


cctttgctag 


agaactttga 


60 


tagtggtatg ggatgagttg 


atcattatag 


ttttgtataa 


tgaatattgc 


acatgtatat 


120 


aaattactgg 


tctggggcag 


caacaggtat 


acagaagcat 


gagcctttag 


tgggcagtta 


180 


gctcagtggt 


tgagtgctgg 


tggagaatac 


atgaagcaaa 


gcactgggtt 


ctcacgggag 


240 
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gagggggagg 


atctgcaata 


ggagggtgcc 


attaactggt 


ttatctgtgg 


gaaagacctg 


300 


ttgtcttgga 


tgaactgagt 


taactggcca 


tgtggaatat 


tgcagaaaat 


aaagctagaa 


360 


atgtggtcat 


taccaagagt 


gcagatacgt 


gagcataaga 


accatcttgt 


ttagactgcc 


420 


aaagtattca 


tgactttgta 


ttgtctgaag 


atctcattgt 


atgtggtggt 


gttatgggtc 


480 


tgtttgtata 


tttcttctta 


atgggcattt 


tagcaagttg 


ctcacactgt 


tcaatttctc 


540 


acccttttgg 


agaaatcaac 


attagtctgg 


ctttaaagcc 


taaaacatac 


catctaatga 


600 


cagactttgt 


tagagaaaag 


cttagaagtc 


agctaatatc 


tgagtgtctg 


cctttgggga 


660 


gcaagcagct 


cctgtgtctc 


ttatatcaaa 


accttccaca 


ggaataatgt 


gttctggctc 


720 


ctcattgtgc 


atgacgcccc 


tcccagataa 


gaaacagtct 


ggttttcagg 


ttctcattct 


780 


gacttttgag 


agcacaagat 


aaagatctta 


gctctacttg 


gataattgac 


tatagaaatc 


840 


agaattctca 


gaaatcaact 


gctttagaag 


gttaaaatgt 


gaagtttctt 


gtttgttttt 


900 


tgcttgtact 


gaatattttg 


aattttaatc 


tacctcgtct 


cattttttgt 


agttacttct 


960 


gtcattgcca 


cttgagcatg 


aagcttgact 


ttggtcgttg 


tctaaaatgg 


gatctctcct 


1020 


gttcagtgac 


agttttttct 


cttgcctttc 


ctcttggccc 


ccatttcacc 


taacatcatc 


1080 


tcttattgct 


cctgtcagaa 


atatgtttta 


aagtcttgtt 


ttatgggaaa 


aagtggttca 


1140 


ccaatgctgt 


ttgtctctga 


ggtcagtgag 


agaaggagtt 


aaaacacagt 


ggtgaaggaa 


1200 


gggcagatcc 


tgctctggag 


gcaaaagctg 


acaaggaagg 


agtcattaat 


tacagacaat 


1260 


ttcaaagtca 


actgattgtg 


atcattaatg 


tcacaataga 


tcaaaaccta 


attatcacag 


1320 


cctaggtaag 


agccatcagt 


attaatcgga 


aaatctaata 


ggaaactatt 


accaagaaat 


1380 


taggccaaat 


tgaatgcaat 


gaacttttta 


tcatcttttt 


ttggtaatgt 


tcttggtgtt 


1440 


ggcaactgtg 


accctaacag 


ctggtccaag 


cccagattgc 


taatttcact 


tttaaagagc 


1500 


ctaggctggc 


cctagtgctt 


gtagtactcc 


tgcctcagcc 


ttaaatactg 


ggattatgag 


1560 


gggatgtgtg 


cattcctgtt 


tgtgtgggtg 


ggtaggagag 


aattgaactc 


agggccttgt 


1620 


acattttaga 


caagtgctca 


accactgagc 


tgtattccta 


gttcagagtg 


ttttgtttgt 


1680 


tagggaagtt 


tccagagaga 


ctgaatctga 


gctaacattg 


ctgaaacacc 


aaccttgaat 


1740 


tttctcatca 


tggatttgtc 


cagtctaatc 


agatcagtgg 


gatatacagg 


tatatattaa 


1800 


agtcatagag 


tgtccctggt 


gaacattgtt 


gctgtatcat 


ccgttgagtg 


cttgtgtata 


1860 
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gagctctgta ctggctgctg tgaaggctgt gggaggaaga cagtgtcctg ctttgaaatt 1920 

ctttgagaac taaaaagata gtttccttgt atgacaacag gaagaagatg caagccagtg 1980 

ttttcctgtt tgcagtaacg ctaggatttg atcctgaaca gcctggcagt gtgcctgatc 2040 

ccagtagttc taagatagga acaaaatgat ttcccactgg aattctaagg gatggggaag 2100 

atatctttgt tttcctttct gtgttacaag ttatatctct ggtagtagga gaagccatca 2160 

gtatgtcctt aagatgtttg taaagatgct accacctttg ggtgagatga gtcaggcagt 2220 

ggaaaagttg ctgtcctttg ggtgtcacag gaggaatgcc aggcagtagt agagggaatt 2280 

gtgcctagga gtctacactg tgccctgcag ccctgctgca cttcaggtgc tcagtgctaa 2340 

tcttacttag gaaggccatg ctcctttctg gaacttggat tacccactac aagactgaga 2400 

atcagtgtct tttgtttgtt tggttttttt tttttttttt taagacaggg tttctatgta 2460 

gccatggatg tcctggaact tgctctgttg accaggctgg cctctaactc agagatctgc 2520 

ctgcccctgc ctcccaagta ctgggatcaa acatgtgtat caccactgcc tggctaaccc 2580 

ttcacttctt aagtgagttt tagtgtttaa aatgcaactt aattttcaaa atggcactaa 2640 

ggtggtttgt ccagccttgg tttactccag atgagcagca gccctcaaca gtgtgttgat 2700 

tgaactcatc tggctgcagt gaaatggctg aaagtgagat tttctagtgc atacttctgt 2760 

gctagcctcc ctacttggtc ttaatggctg accttgggca agtctttcct ttcaatgcct 2820

cagttcccca tctgtcaaat ggggtgataa tactgaccta cctcatgggg tgttgtgagg 2880 

cattacagat cagagctgat caactccttt aggattgcct gtggaggata ccttcagtcg 2940 

aagttttttt agccctcact gagctacttc caagtttgga agtacttagt gaccttggtg 3000 

agtttctgtc cccaatccct taacccctca tcaatgctgc tagttcagat gttgacagtg 3060 

ttttgtgaat gttggagtct tgatcatcca gtctaacttg ggatgctgtg agggaaggca 3120 

cttgtgggct ctgtgtcttg catcacaaag agacttagaa attcaagagc cttgggttag 3180 

ggaatcttga ggcaggagtg ctacttcatg tctttctctg caggcaggag ttaaggtcct 3240 

attctcatgg atgacttgtg ctgattgtct tggtagtgtg gattgatatt gttgcttcca 3300 

ttttccaggc tgacctagga caiggtcgtcc actgaactct tcatgctccg atgcctctgc 3360 

cctctcatag aagccagacg gctcaagcaa cataacactc ccttccccta ccaaaaagag 3420 

ctacattgta ttttgttttt tttctagaaa taggtatacc atttcagaat tgagcattgg 3480 
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tcctcaaggc 


ttcttttcag 


ttcactgtgt 


gtggaatctg tcacctcttt tcagtgtata 


3540 


ccatctgacc 


cactgaggac 


atcttggaca 


cttgtttgcc cacctgaggg ttcaagagcc 


3600 


tttcttgctc 


tgtctcagta 


tagggactaa 


ggcatcccca gattcttcat gataggtttc 


3660 


cttttcccct 


acttgacctt 


ctattggtac 


ccatgaaaaa gcacaaggtc ctagctccct 


3720 


gtagatacac 


actccagttc 


tgttgggtag 


actctactcc taacaggggt cctagagggt 


3780 


tctctggagg 


atgggaaagg 


aacagtctca 


tgtcctgcca aatagtagcc agcagttgcc 


3840 


ttacccctgc 


ctgctggcca 


tgaacccaca 


ttctcacgga aagggctttc agagacccct 


3900 


ggtttttctc 


agcactgtac 


cattttgtgt 


aggaagcagg tcacaccttg tagccaggtc 


3960 


taaatggaag 


tgagacagtg 


aattctctgt 


ggactttttt ggtttgtggg ggggttttgg 


4020 


ttgtgtttct 


tggttctgtt 


tcagccaaac 


ttgtctgctt ttgatttctt taaaggataa 


4080 


gtggtctatc 


tatctatcta 


cacgtatgtt 


tgcattttaa gtgtaagtgt tttgagaaaa 


4140 


aaaaaaaaca 


gacattatgt 


cattctacaa 


ctgacatcca attcagacat catcctctcc 


4200 


cccttccacc 


ctctttttcc 


tcttttccct 


cttttcatac tcttgtattg gttctaataa 


4260 


acgattgctt 


ttcaaat 






4277 



<210> 90 

<211> 3887 

<212> DNA 

<213> Mus musculus 

<220> 

<221> modified_base

<222> (3295)..(3295)

<223> a, c, t, g, unknown or other 

<400> 90 

tagcagtgac aggccatttt tggtatgctt aggattactc acatttgttg cagcatgaga 60 

gtttggaaga attggaagca ggtccaatct ctaccttatt acttatttta atcaattttt 120 

aagtctttgg cttttgcagt agaattccta gaggtttttt tttttttttt ttagtatttt 180 

tatacattga tataaaaaag aaagaagcca gaaggacaag tattattttt ctccctacaa 240 

aaggaagaaa ttaagagact atagttgtca gatacaaatt tttaatattt aattatttct 300 

gattattttc taaaagaaga gaaggggata atggctggta agacttgata ataaaaccaa 360 

atgtagacaa aattatcggg gtaggagaaa agtgaaaaag gaaaccagga acaaagtagt 420 
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taatagaaat 


agtatctccc 


tctttacatt 


ctctcatcca 


agtgttgtgt 


attttaaaac 


480 


tgttaatctc 


tcaggtattc 


tgcaggtcaa 


aacacatcct 


tgcttggaaa 


aaacaaagtg 


540 


tcaggtaaga 


cagctgtaga 


gtaggtacca 


catatgacta 


gggttcaacc 


actgctcata 


600 


agaagtctcg 


atgacataag 


gataaatata 


cagtgttctc 


cgtggtgagg 


gtaactcggt 


660 


ttccatcctg 


gttacactaa 


gagcaagcat 


ctaagtggca 


tctaatgggt 


cagaattaga 


720 


gctttctgtc 


ctcattttgt 


gttgttagtc 


acagggaaag 


ccttcctatg 


ttaccagctg 


780 


gttggttcac 


tcctttgtct 


ccttgtactt 


atcttctaga 


cagattctga 


aaaagagaat 


840 


ggaagaaaag 


gttttaaatt 


gttttctggc 


atctggcaaa 


ggacttgagc 


atgccttttg 


900 


gctctgatag 


gcacttgtaa 


gcagacatct 


aatgaggcta 


ggctagtgtt 


tttatcctct 


960 


aatcaggctc 


tatcttactc 


tgtcttgctt 


tctaaatcct 


atctctaaac 


aacaccaagt 


1020 


gtcttatctt 


tggatgggag 


gataagtgaa 


ttagaagtcc 


ggcctcagtt 


tgggaagagt 


1080 


tctcaatatg 


aaaggttaat 


aacaccattc 


ttacagaaag 


aaaaaaatca 


tccattgccc 


1140 


aaattgggaa 


gccacagaaa 


atgcttgagc 


taaccaaaca 


tcccataatc 


gattttctat 


1200 


tcacttaacc 


agtctattca 


cgcaactagt 


cactagctat 


gttctattta 


aaacagccac 


1260 


cactaatcag 


aatttcactt 


tctcacctga 


gctaacatta 


ttcactgatt 


ttggttggca 


1320 


tatttcctca 


tgcctgtggg 


ctcctctttc 


tggctctctg 


aaatgatgat 


cttgtttgta 


1380 


ttacagagct 


cttatttgga 


agacagtgaa 


tcagcagcat 


tactttgctg 


tgaatatgga 


1440 


gaatcagaaa 


tctttagtga 


tttcaatgta 


cgtaggtata 


tgtttttata 


cacacacaaa 


1500 


taacatacag 


cccttggtac 


tcatggatat 


tttgttattg 


tttctaggga 


gctttattat 


1560 


gctgtagtac 


atttactcag 


ctattaaaaa 


caatgaattt 


atgaaattca 


tagacaaatg 


1620 


gatggatctg 


gaggatatca 


tcctaagtga 


ggtaacccaa 


tcacaaaaga 


acacacatta 


1680 


tatgcactca 


ctgataagtg 


gatattggcc 


cagaattttg 


gaataccaag 


atattaccaa 


1740 


gatacagtcc 


acataaagct 


caagaagaag 


gaagaccacc 


atgtagatac 


ttcagtcctt 


1800 


cttagaaggg 


ggatcaaaat 


acctatggga 


ggaaatagac 


aaagtgtgga 


gcagaaactg 


1860 


agggaaaggt 


catccagtga 


ctgtcccacc 


tggggatcca 


tcccatatac


agtcaccaaa 


1920 


cccagatgct 


attgtggatg 


tcaacaagtg 


cttgctgaca 


ggagcctgat 


atagctgtct 


1980 


cctgagaggc 


tctgccagtg 


cctgacaaat 


tttgtttctt 


ctaatacata 


tctgtggcta 


2040 
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aaattagtcc acatagctga gttctacatg tctcagatgc ttattgtagc gctagcactc 2100 
atggaaatat gtgttggtgt tacttctcag gtataagatt cagcctcttt accccttttc 2160

tccatggtac agttctagga cataattgta ttgcctcacc tggttggatt tcattccatc 2220 

ttgatgaaga gtctggaatg ccacagattc aattattgtg gctcataaat agctggagtg 2280

gtgctcattt ccctaatgct taccgacatt aactgatgaa tctacacaaa cactctttta 2340 

tcatgacact ttatagatga cagactgagt gtggaaaggt tataagattg ttcggtgttt 2400 

gtgatacgca gggacagcac tgacgtgcac atttcagctg acagctatca gtgagctcat 2460 

taccagagag cagagtcatt catagattat gtaccaggct cctatatagt atactatgag 2520 

attttattca gtgtgttttt taaccttgca aataattatt taggtacttt tacttcacag 2580 

gcaactaatt tcttttcctt tccttcctgc ttgtagccat ctcccccaac tgcctcccca 2640 

gtagtatctc tctaggtcac tcaggctatc tcagaacttg ccatgtagcc taaactggcc 2700 

tcaagtttgt aatcttgctt tcacctccct agtcctggaa atactgacaa tcattttata 2760 

gccagaattt ttattcattt gaaatgtcag gtttcactct tgccacgtag gcctgatgac 2820 

ctcagtttga tccctatgtt tcttagtgaa aggagagaac taactcctga aagtcatcct 2880 

ctgaatctat acataggctg gtacacacgc acgcgtgcgc acacacacac acatacacgc 2940 

acatacacac acacacacat acacacacac acacatacac acacatacac acacacagac 3000 

acacacacac acacacacac acggcaagtt gactgaaaat tttttgttat tttcctggtt 3060 

ttttgttttg atttttgagg cagggtctca tgtatccaag gctgacctca gactcactgt 3120 

gcagccaatg atgattttga acttctgatc cttctgcctt taatttccag agtgctcaga 3180 

gcacaagcat gcaccaccat acttgctttg tacacttctg gagatcaaat ctagggcttt 3240 

gtgcatgcta ggcaagcact ttactactaa gtcacaaccc tacccgtgga ctacntgagg 3300 

ccctgcttct ctatctttgt ttaaaggttt tgggtccatt ttgttcactc agggagtatt 3360 

ttcaacctct acctctttct gcctattagt actccaatat ctaaggaaat ttgaagaggt 3420 

cctttctaac ttaaagtctt gggatttagc tcagtggcag gatgcttgtc tagtattcac 3480 

aaagttctgg ttcacttcca agtactgccc attcaagcat acggagtaac agtaatctga 3540 

agatttagca tagagggtgg tggtatttgt agagcaaaca ttgaaatcct cattatttaa 3600 

ctttcttcaa aggaagtgat atgggatctt gataaagaag tggaaatact catgtaaaga 3660 
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ggaagagtag atgataaagc 


caattactga 


gaacctgtct 


gtaatcttgg agtaataata 


3720 


gtgtgaactt 


tggctagatg 


cactcattca 


ttcagcaatc 


attgattgac tattatgtgt 


3780 


aagactactg ggtaaagagc 


tatgtagttt 


ggacataaaa 


tacagaatct gtgcttagag 


3840 


aatgtataat 


caattgacaa 


gttagtttta 


tggatggata 


tgaaagc 


3887 


<210> 91 

<211> 2219 

<212> DNA 

<213> Mus musculus










<400> 91 
ggccagttaa 


cttttctggt 


gccaatgtga 


actttaaatg 


tttttatttt acctgctttg 


60 


tggatgaaaa 


atatttctga 


gtggtagttt 


tctgacaggt 


agatcatgtc tttttatctt 


120 


gtttcaaaat 


aactatttct 


gattttgtaa 


aatgaaatat 


aaaatatgtc tcagatcttc 


180 


caattaatta 


gtaaggattc 


acccttaatc 


cttgctagtt 


gaagcctgcc taagtcactt 


240 


tactaaaaga 


tctttgttaa 


cccagtattg 


taaccatctg 


tccgcttacg taggtaaatg 


300 


tagacgcctg 


gtgtatatgg 


cttgtagttt 


tagtgttggc 


tctccgtgtt gagattcttt 


360 


ctcagtgtca 


ttctgtgtcc 


tacaagtttc 


atgtaggttt 


cgatgttagg atggttaaga 


420 


tgtatatagt 


acaaaatgtt 


cagtctttga 


ttgttttatg 


tttgtttgtt ttcatgacta 


480 


gtaatggtag 


tggatacttt 


aaaaatattt 


tctgaagatc 


cttaaaacct tggaaattgt 


540 


aacatactga 


gaagagtggt 


tgtttgaata 


tttttaacaa 


cttgtataag tcaatatgaa 


600 


tacaatcaaa 


agctaaagtg 


cttcatagaa 


cacgggattt 


accttaagta attatcatga 


660 


gagacctctt 


gtaaagactg 


gtctatttta 


ctacagagaa 


cagtttgaga gtgaaactgt 


720 


tacaatttag 


actttttgtt 


gtattttcta 


agagaaagag 


tattgttagg ttctcctaac 


780 


ctctgttgac 


tactatggta 


agtgatgtta 


ttttaattgc 


aaatttaaat agaaaccaac 


840 


agaattaagt 


aactgactgt 


ctctctctct 


ctctctctct 


ctgtcttctt catgaaaatc 


900 


cttagttctg 


tatactgcct 


tctgcttaga 


tttagttata 


tgatcattag gttacatttg 


960 


atctaagttg 


actaagattt 


caatttcaat 


ttatatttca 


agcattctgc cagggaatag 


1020 


ttttaaaaat 


tgtttccaag 


gcacttcagg 


tacaagtcat 


ataggtagtt tgtttaatct 


1080 


agtccttagc 


ctaggactca 


aggactgaat 


gttttcaaat 


aacacttttc ttgttctcag 


1140 


agcctcagtt 


cattaaaact 


cttttaaaac 


ctgtgtgctg 


agagttaagc aagacctctt 


1200 
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ttcttgtctt 


catgagttgc 


gaaattgaat 


tcatggagct 


gatgtggcta 


acaagtttat 


1260 


tttaggaatt 


gtttagaaaa 


tgctgttgct 


tcgggttctt 


aaaatcacag 


cactccaact 


1320 


ctaatcaaat 


tattggagac 


ttagcagagc 


tggtctgaaa 


tcacactaaa 


aaaaaaaagg 


1380 


tactggatcc 


tttccagtat 


ggttgtgtaa 


aaaatggtca 


ctcggaaggt 


cacagaataa 


1440 


ccggctctga 


atcaccaccc 


ctgagagagt 


tcagtgtctg 


cgtgctgatg 


gtcttttgct 


1500 


gtataaagtt 


cattctagaa 


ttttgaagta 


accacccgtt 


ttactacata 


ttagtcatgt 


1560 


gaattttgaa 


gttattttgt 


aatagttact 


ttcatttgtt 


atttactgtg 


ttgatcaaga 


1620 


tgtgtgtgag 


tgtttagcat 


ttattaaagt 


attgtctttc 


ctcctagact 


tttgcctctt 


1680 


cctgtagtta 


ggggtagagg 


tgaggtgagt 


ggagtgacta 


caatgtgatt 


cgctggggta 


1740 


ggacagtttg 


ttgtcttgtt 


agtcacagtg 


tgttgctctg 


ttttttattt 


tagtaagatt 


1800 


tatattttga 


gctctctgaa 


tggaaagtgc 


cactgcaacc 


ttagctgcat 


gcattgttcc 


1860 


cattagacct 


aagtcttcct 


ggacttctga 


gtccccagat 


actggataaa 


ggaagaggag 


1920 


agatggagta 


gcctgtctgt 


cactgctatg 


agagaagcaa 


gaaagtgggt 


agagaaatac 


1980 


ttcacatttt 


agtcacatct 


tctcttttta 


aaaaggtagc 


catcagtgaa 


ggaaagaact 


2040 


agtcaaaaat 


aggcttgaca 


acctgaattc 


agatccccag 


aacgggcata 


aaagcagcca 


2100 


ggtcatggtt 


agtgcactcc 


tttaatccca 


gcatttggga 


ggcaggggta 


gacatatctg 


2160 


ggtttgaagc 


cagcctgatc 


tacagagtga 


gttccaggac 


agggaaccct 


gtcttattc 


2219 


<210> 92 

<211> 3433 

<212> DNA 

<213> Mus musculus












<400> 92 
ttttataaac 


agctcaatat 


aaaacataga 


cagtgtttga 


ttattttcct 


tgtgtaagtt 


60 


tgatttaaaa cgttggaaat 


gtgtgctttt 


tagtgtttac 


taaagtgata 


agaaaataaa 


120 


gcattcaata cactatagat 


tccaaaacat 


aacattgcac 


caaatagaaa 


tgtatatttt 


180 


attatgcaat 


gccttagtca 


taaactgggc 


tcaaacaatc 


ctcagcctaa 


aacactgttg 


240 


tcttttaata 


tgcttccaac 


ccaaaggcct 


ttcatctcag 


tatctgtcaa 


acttgaataa 


300 


cgtcttctct ttactattac 


acacgaggca 


gcctattaac 


ctgtgtctta 


gaattgttgt 


360 
atatagttct


tttaatatcc 


atagggtata 


tttttgaatt 


ttttggtgag 


tatttgttga 


420 


atttgagata 


gcatacagta 


gatgtttaag 


aaatagtagg 


aagtccggtg 


agacggctgt 


480 


ttccatcaag 


aactcctagc 


actgctgtaa 


atatcatggt 


gcctactgga 


agggaatgta 


540 


gatgctatgc 


attttggaaa 


taatctgcat 


ctgttaaacc 


tgcagaagtt 


ttttaatgcc 


600 


actttaacac 


taatgcactg 


acagattcta 


aatattttgt 


gagaaatgtt 


gaaatgttta 


660 


acctgatagg 


cttctctata 


aaagagtgtt 


ttgttttttt 


ccctgcacca 


caagctgtgt 


720 


ttatcacttt 


acagttgcat 


gttcaacttg 


tcatagctgg 


aattactgta 


taaaaagaac 


780 


tgattgtgac 


ttgtagttct 


tctctagagg 


atgctgctag 


aacgtggttt 


tgctttgcat 


840 


tttgtagttc 


ttcccgtcag 


tgctgtgtgt 


agtcctgctt 


cttccagtct 


gcagtattca 


900 


ctagaggcgc 


cctttgcatg 


ttgcactctg 


ttctcatttg 


gaggttggac 


tcagaccagt 


960 


tagcacagta 


ttctctcatc 


tgtgtcactt 


tgtaaaacta 


actgtactct 


gtatttctta 


1020 


tttgtacata 


tcaatgtgag 


aaatctccct 


tttttatgtt 


gcaattacct 


tgtgatcagg 


1080 


cagcttgagt 


gctatgcaaa 


tagtaagtag 


tgtagtggtg 


atttttcttt 


gcatgttgtg 


1140 


tgtgatatac 


ctagccagaa 


atagatgtgg 


cttttgtttt 


gggggcagat 


tactttcaaa 


1200 


agcaaataca 


attcacttga 


atttgacaaa 


ctgaagcaga 


caagtgttct 


gggtcctctg 


1260 


ataatttggg 


gtgtttggct 


gtcagctagg 


ctcatgaagt 


ccactgtact 


gtaatgatgg 


1320 


tatttacctc 


tgtgctattt 


taattaccct 


cgcgtctgtg 


gaactgctga 


tttgagtagt 


1380 


gattagcatt 


tagaaatttt 


gtaatgtaga 


gttttagaga 


gagcactttg 


aaagataaac 


1440 


attttattat 


gatggtgcta 


ggtacaaaat 


ttatagcatg 


ggatgtgaag 


aaaaaaaatg 


1500 


agaacccatt 


gaaaggaaga 


aaggaatttg 


ttgtctgctt 


ctaagctaga 


gtggttgtaa 


1560 


aggttctgct 


ctgccagtgt 


tcagtatcag 


tggctgaatt 


atggataaga 


actgtagaga 


1620


atcttctgtt 


tagtccatgc 


tttttatgta 


gaattggttt 


tatctaatag 


ttttactatg 


1680 


gaaatcgcct 


tttgatatta 


aagccagatt 


ttagaggttt 


gatatgtttg 


gtctcaggag 


1740 


ctcaaaagaa 


gtagcttttt 


ccagtgtctt 


ttgtgttact 


gatttgggaa 


atgttgaaag 


1800 


attggacagg 


gaagaatagc 


gcttggtgtc 


ctcatggtca 


ttctgtctta 


ccttagtggc 


1860 


ttgacagtac 


ttactatagc 


tcctgaggga 


agccaatcaa 


cttctgtttt 


cctacctgac 


1920 


ctgcagggca 


tgatggatca 


gtgatgaaag 


aattgtagcc 


tgtggcactt 


tgttttgacc 


1980 
tctggtatag


aactctgact 


tttatagttt 


taaaatgatc aatctttgta tgaaagtcag 


2040 


ttttctttct 


ttaggtatca gaagattttg 


ccttattctg aggtcggact agggccaagc 


2100 


aagctttttc 


cttcatatga 


gctgtcatac 


tgtattttga cctctttctg cacgaaaaaa 


2160 


gaaagaaaaa 


tgacagctgg 


aaaaagtcct 


ttgtattggc acatgataaa tcttgatgac 


2220 


gggctatatt 


ctgaacaaac 


gcttgcactg 


accaaagctt gtgtttagcg tgtgtttagc 


2280 


ataggtgctg 


tagttggagt 


gggccccacc 


tctgctcctt tcagtgttct tgctgttcaa 


2340 


gttattcctc 


tcagacatcc 


tccattgtcc 


cacgcactcc ttggcagcag aatgctggca 


2400 


gcttgtgacc 


ctcatgtttc 


ctgtaagtct 


gtttcatggt agggttcata cctgcctcag 


2460 


cctgacagga 


gtaggttttc 


cgcaacccac 


tgtgctgaga cagttgatgt gatgcaaact 


2520 


attcttggag 


tgtgttgttt 


tctttggccc 


aagggtaaac taaaatacct gtgaactgca 


2580 


aacttgcttg 


tgattgactt 


ggtgcaatac 


agttcctttc taaacgtttc tggtttagag 


2640 


aatactcagc 


catcctggaa actagcaata 


tttattttgc ctcatttact ttattttcag 


2700 


ttttagatga 


ggtgaaggtt 


ttctttcagt 


tagggtactt ggcaaggtcc attctagtgt 


2760 


agctgaagag 


gaagctatac 


tattttctgt 


ttactgtatt ttgctcccta ttttatgctt 


2820 


tttaaaaatc 


tttcacagtg 


actcacttga 


ttgagcatat tgatttgaaa gagctaagat 


2880 


tttcctgctt 


aatgtcagcc 


attggctctc 


ctgataattg ggactttttg tttgtttgtt 


2940 


ttttgttgtt 


ttgttttttt 


tgtttcccct 


ttccctaaaa ttataagttt agagaatcct 


3000 


gtaagagggg 


tgggtgcatg tcttaatgct 


gagaagcagc agctttgggg ttgaatttac 


3060 


ttgaatggtt 


taaaaagatt gcattgttac 


aaagctagaa aaaatttatg tatgaaaaat 


3120 


gatcctagcc 


ttacactgag 


tacaataata 


cttataagca tgtgaagatg tgatctttgt 


3180 


gatgttactt 


gtacaaacat 


atatacatat 


ctatacatat cttttgaagt gttgaaagga 


3240 


atagtacttt 


atattcttta 


tcataccact 


tttgtaagtg ttgaatatat tgctgtttat 


3300 


ttttctatgt 


tctgtttcgt agccttaaga 


agtgtttcaa acgatctgaa tgtataaaac 


3360 


aagtcaatgc 


cctacatggt 


gtgatgctgc 


attatatata caaccgtgtg catatattaa 


3420 


attctgtttg 


tcg






3433 
<210> 93

<211> 2228 

<212> DNA

<213> Mus musculus 

<400> 93 



ggtgtccgcc 


gcccgcgggt 


ggggcagctt cttcaaggct cagcctccct cctgccggct 


60 


tccttgtgct 


tttttttcct 


ccaacccttt ccgcgatcac ggttttctgg tacccaatag 


120 


aaatgggcga 


cagcgaaccg 


gcctttaacc tttccttgaa ttaatgagct tcaggttcag 


180 


cttcctccta 


ggtttttttt 


tttttttctt ccttgttttt aatgaccctt cccaactcct 


240 


tttcctatcc 


ctgagagctt 


cagttacttg caggaagcca gcgctttccc aagctctgtg 


300 


tagcaggtcc 


ccctcaaccc 


ccacccctgt ctcagccaat cgcaactgtt ggacgtcgta 


360 


gcaaatttgt 


tggggtctct 


tggcagaaca agagtctatc cgagccgcgt tagatactca 


420 


gatgacaccg 


aggtccgtag 


cagatgggtg gactgagaag gtacctgcga gctcccagcg 


480 


gaattcgctg 


gcagctgcgt 


gaggctctgg tgttttggtt tctttgacat tttcccagct 


540 


tcaaagcgga 


gaacgggagt 


cagggattgg gcgaacttca acgtgttcgg ttacaaggta 


600 


attagattaa 


tgagagtggt 


gagtttttat gcagtgcaat tatgccacga tttcttgttg 


660 


ctcggtggct 


tttccacaag 


tttccagtct attaaattaa tccaatggta ggtctgtttc 


720


tacctgactt 


gtcctccaaa 


acctggggca aaccccctcc aactatctcc aaggaaaagg 


780 


gctagtaaac 


ggagacaaaa 


gaaagcactt gtggaggagt tgacgatagg atcccatttg 


840 


tcatcagata 


gtgtgaagtg 


gcatctcctc tggagagtgg cgtcagtgct cctaagagcc 


900 


acgttacttt 


ccccctgcct 


ctgccattct ccaaactgtt gaggactggc atttaaaatg


960 


cacttagcgc 


aatctaaatt 


acagtttgca acttcacctg ctacattacc cagaatctct 


1020 


aattaggaat 


ctcctatcaa 


ctacaggcac attaatattc cacatgttac agtgtgctga 


1080 


tcaatgacca 


aggataattg 


gggggcaagc cttaagtctc cccatcttat tcattagaca 


1140 


cactgtggat 


tcctcaaacc 


accagatccc taatgaagga gacctgtctg ttttcactgt 


1200 


aatttttcat 


tacttattac 


aggccatgat gattcaagtg gtaaacacac ctttcactac 


1260 


tcaagatata 


atccaagcct 


tatgtctatc tttaatatgc ttaataatag cctcagagtt 


1320 


gcctggagaa 


atgacttttc 


atggggggag ggctggaaga atctgacagc tttacataat 


1380 


ggtgtatact 


aaatgctatc 


aggacaaaca tttttaccat cttgcagtta ctacatataa 


1440 
tagacgaaac


acagttcaga 


aaatattaat 


aagcatcttt 


cttttaaaga 


gtgatcacat 


1500


gatgctcttc 


ttaaaagttc 


ttttagcttg 


gagctgtaga 


gatggttgag 


tggttaagag 


1560 


cagtggctgc 


tctcccaaaa 


gaccttaggt 


tcacttctca 


atgcccaccg 


gcaactcacg 


1620 


atagtctgta 


actccagttc 


caggggattc 


gagacccctt 


cacgttcaca 


ggcaccaaac 


1680 


acacacatgg 


tgcacagaca 


cacatgcagg 


caaaatgttc 


ttatacaaag 


attttaagtg 


1740 


cttttagata 


gctgattgac 


agcaaagtgt 


agagttttct 


tacctcaata 


tttaattata 


1800 


tccccaaagg 


tacattaggc 


tatcattaca 


tctcatataa 


acaaaaggtt 


aaagatcagt 


1860 


acattcaatt 


tggagtcttt 


ctgctgatga 


cttcgctttt 


tctgcactat 


cttgtgtgta 


1920 


tatatgtatg 


tatatatata 


tatatatata 


tatatatata 


ttttaaatac 


aagggggtct 


1980 


agatattctt 


atcagtagag 


ggcaccctcg 


agttgcttca 


aatcctgata 


tcaaggaaaa 


2040 


aaaactgggg 


tggtggtggt 


gggggaggag 


aatcatcttt 


aaggtgtttt 


actgcacctc 


2100 


tgttgtaaaa 


tttgacaagc 


ttaagacatg 


tatgcctaac 


aattcacttt 


ttattacagc 


2160 


gaagtgttca 


gctgctagag 


gaaaatgcta 


ttttttttaa 


aaataaagta 


caaggtggtt 


2220 


cctatttt 












2228 



<210> 94 

<211> 2543 

<212> DNA 

<213> Mus musculus 

<400> 94 



acagaacaaa ctgcattaga 


atgcaaggca 


ataaagcaaa 


aaactagaaa 


catcctggac 


60 


aaataaatac tagcagttta 


attaaaacaa 


agtagtccaa 


catgtgcctc 


agaaatagtt 


120 


aagttttaaa gcatatgttt 


ctgccaatgt 


accaacacaa 


gatggactta 


ataccaaaaa 


180 


gaaaaaaaaa acagttcaac 


ctttttatta 


aaaaaaaaaa 


aagaaaaaga 


aaaagaaatg 


240 


acagcaaaat ccctaaaaaa 


aggataatta 


acttttcaat 


ggcgactata 


ttacatacgt 


300 


ggacagtaat ttccaaagag 


gcacactaat 


gggggcacga 


agtctgctac 


aaaatagctg 


360 


agacaaaaca atgtacctca 


aaatgttttg 


caggactaca 


ctgtcttttc 


cttcagaaga 


420 


ggcttttgta tgtcttctta 


ctcggcttag 


ctccatcttc 


atctgtgcat 


acttacgctg 


480 


cactgtcaaa cagtaacaaa 


aagccctaat 


caaagtgaga 


aacaatacac 


ttacctctgt 


540 


ctgatctgag gtcttatcag 


atcagtttta 


attggactga 


gtgtcacctt 


cagatttaat 


600 
gtcttcactg gtcccattga cctcattctt agtgtcactt tccttcgatt cagggtttgt 660

gtcttcactc tgcttttctc gactactgga cttcacacta ctctttttat catcagatcc 720

cttcttttct gccataaaag aagacattga ccatggaatt tacagtgaaa cacaacaaat 780 

ctactactat ttaaaacagg aaacattgag agcaaagaac catttcccac ggctgtgaga 840 

cagtctgcag agctcagtgc tgtcagctgt agattacaga gggaaagggc aagaagacca 900 

aaaagaaaca gaaaattctc tcaaacctgc tagagtcacc ctataattga caacttcagt 960 

aaatttcatt ctagataaag accatccaaa ggcaacaaga aagacaaaag acaaatacac 1020

aaaacggtta acatgggtta attaagttgc aaggcaaaat ggcacccaga aatttaacat 1080 

ttaaatgaca cgtttattaa aagaaaactt atacatattt aaatatccaa gagggggcac 1140 

acctagtaag aaaatgagat cttcaaaaaa atctaaaaac ccaaatttgt gttttctgta 1200 

atgaaatcta agaggttttt ttcctaaagg aaaaaattaa agctgagttt ggaggaaaaa 1260

aaaggcagca gtttccagct ttccctagag ttgcattaga aaggtgaaaa agcccatgtg 1320 

aaggaaaagt gtcgtctgcc ttccagctgt acattacatg tacctctaca aaaggattct 1380 

gcaaaactat taaacacacc tttccataaa agctattttc cccttaaaac taagaacaat 1440 

tcattagttt gacaaagtag ctcataacag gaagatgtct gaagagaagc actctccaga 1500 

cagttgttac agacacaaca tcaaacacct ggaaggagga tgggagcaca aaaaccagtg 1560 

ctgtggactt gcatctacag aaaggaagga gtgtgcatca ggtccagcct ttgtaaacca 1620 

agcacagcac tgtctaagtg gttaaccata gacttgcttc ccagctgtct gtccacctgt 1680

gggaactatg ctctaactga ccttgtggga aagtcctccc tttgagcctt atgggcttct 1740 

ccggtgcaca ggtagcttca atgaggaaga gtcgtcttgt tgctccgtag aggactagta 1800 

gtccaaaaat cttatcttct ggcctatgta aatatggtag atcttaatag tcctaaggat 1860 

atccatgtgt gttggatttt agagttaggg attaaacaga aactctctga acagaatata 1920 

gatacctatg ggtagaaaat tactccccct ttgttgagtt ccgttaggtg cttctgttat 1980 

gattacatca cgtggagcag tgaacagtct tactcatctg ccattattag atgtttgtta 2040

gtacagatgc ctgatgccca caggccagaa tgctatcttt gtatattcag aagtgtctag 2100 

aacttcccta agtcaataga ctcatcattt ctagaagtct aaagtttacg ttaatgctat 2160 

aatggcaggc ccaacataaa taggcctgct aagttttttc agtatctagt cttctctttt 2220 
ccctagtctc accaggaaac


ttcagaattt 


tactaaattc 


aacttgtcac 


aagtcagtaa 


2280 


tcaggacatg tttttgcttc 


aacacaaaat 


tatacaaatg 


ttgtctattt 


aagtggcaag 


2340 


actggaaaag aacaattaat 


aaaacagaaa 


tatgtaatca 


aaagcttcat 


aatacatggg 


2400 


tgattgattt gatctgatgt 


aagagtcatc 


aggtaactct 


tcaaaactga 


tctcacttaa 


2460 


aatataccaa agcctttcta 


aacaaaacat 


agtacaagag 


gagacttcac 


agaaacaatg 


2520 


tctgaattat aataagcaga 


ggc 








2543 



<210> 95 

<211> 2305 

<212> DNA 

<213> Mus musculus 

<400> 95 



acagaccttt 


tgcttttcgt 


tttctttgtt 


tttttttttt 


tccatcctca 


ccgtctcact 


60 


tcaccactgt 


gagaagacct 


ctccaccctc 


agagccccca 


aagaagagag 


agagaaagca 


120 


ggtgctgtct 


ctcttggccc 


tctaccgggc 


ttagttggtc 


tctttggcag 


cctgactctg 


180 


gatatgaact 


gagacccatc 


tttgaactgg 


acacgaacta 


taaaccagtt 


ctattctgtt 


240 


ctgtgctgtt 


ttgtttcttc 


ttgatcaaaa 


gccaagagaa 


atgtttttgg 


gaatgtggaa 


300 


ggccactctg 


gacatacaaa 


gcttcctcag 


gttcagtggc 


tccgtctccc 


ccatctctcc 


360 


ctgtcagcta 


tagcacttcg 


cagatctctg 


atagtctgat 


ctttggggat 


attgttgtga 


420 


atcatgggga 


cacttatctc 


acagtgactc 


gccatacgcc 


agccatcttg 


ggtcaacctc 


480 


ctatgtatct 


ttcctatgat 


gctcctccga 


ttgtcccact 


gggaagtcac 


ttcactggct 


540 


gaacatctga 


ctgctgctgc 


cccagccact 


gaatccagag 


aatggtgggg 


accccatgtc 


600 


cagctgatag 


atcatcatat 


cacacaggct 


ctggagtcta 


gcagcttttc 


accaacagag 


660 


aggcgttcat 


tgccttttta 


aggacgccgg 


agtgggggat 


cttcattctg 


cttttagcat 


720 


gtggctgcct 


actactgttt 


gccttacatt 


cttgcctggc 


ttcttctgga 


atcttccctg 


780 


gttcttccgt 


acctttcccc 


cacccaatct 


tcagactaga 


agtgaggcca 


taaaccaaga 


840 


aactatgagg 


acatccaggc 


tatctagctc 


atttggaagg 


agatggacat 


tttctttccc 


900 


tgtgcaattt 


ttgtgagtct 


tcagggtgtt 


ccaacatata 


tcctagaaga 


tgaggtgcca 


960 


agaactgccc 


agcctctgac 


tatagggcac 


tctaccctgc 


tgtgtgtcgc 


tcaatctttg 


1020 
aaccttagat


ctttccttcc 


acatgattgc 


aagtgtccag 


ttcggagaaa 


aatgccaact 


1080 


cctcccccac 


tctccaggaa 


tatcttgaac 


ttttattttg 


actctgagat 


ctaggacaca 


1140 


acagaactct 


gtccacagca 


ataatccact 


cagtactatg 


ggatggatgg 


gttagaagat 


1200 


gcgttactga 


gcctgagact 


ggtcagccag 


ccactggaca 


tgcgacctgc 


ctcttgtagc


1260 


ggtgccctgg 


acaggttgac 


actctggtgt 


atgttacaag 


acaaagtgct 


gcatgtggtg 


1320 


aagtggtctg 


cctggggctg 


tgggatgtgc 


tgctgttatg 


aatttggagt 


gaatggggga 


1380 


tacagatgct 


ggttatccat 


ctgtagctca 


ttgcagtgag 


cgtgatgaca 


gggggcacgg


1440 


ggattatctg 


tgtagcatag 


gcatgtgacc 


tttgactaag 


ctcttacccc 


tgcactgtaa 


1500 


tgtgctgaaa 


tgtctttgta 


gacctgaagg 


tgcacttaac 


taaactgcct 


actaaggaat 


1560 


gactctcatc 


tggtccactt 


cacactccct 


tccaggtgtt 


tgccctttca 


cccccttctt 


1620 


aggtattgag 


ccttgatagc 


tcaagcccca 


atgctaagac 


ccgttttctc 


tgtaaggaga 


1680 


agacttatga 


tatactgata 


gccatagccc 


atctcttcca 


acatgtaggg 


ccctagttcc 


1740 


atagggtgcc 


cttggaagga 


tcaggtcatc 


ctgtggctat 


tagccatccc 


agaatctctg 


1800 


aagtgcttaa 


ctcacttcaa 


gtgatctgaa 


tcagagagac 


caaagatatt 


aattttctct 


1860 


gtcctcagat 


ttctagaaga 


cagactatgg 


ccaggaaagg 


ctttattttg 


gtttgggctt 


1920 


tctctctctc 


ctttctccat 


ctgtacacat 


cccctacctc 


actccccact 


tgagacctaa 


1980 


attttgaagc 


tcaatgaccc 


tgttcactgg 


gagtgtgttt 


gtgggttagt 


ccagaatatc 


2040 


aggtccaaca 


atttccccca 


cactgtagtc 


aatggacagt 


cattttctct 


aacacaaaac 


2100 


tgtgtggtgg 


atttagattc 


cactgaggtg 


gaactggggt 


tgccacagcc 


ttgatcttct 


2160 


tgagactaga 


cagaacaaga 


gtggtggctg 


agaaatacat 


tttcctaagt 


agaaatgagg 


2220 


tacaacccat 


aaaggacttt 


cttcacagtg 


agcgtagcag 


ttacgacttt 


ttcaaattag 


2280 


agtctaaagg 


aatcagtaaa 


gaagt 








2305 



<210> 96 

<211> 2771 

<212> DNA 

<213> Mus musculus 

<400> 96 

gctctcttga cagatgcacc ttcccgttgc ccctcgagtc tcatgacatc tcctgtttct 60 
gctggggatt accatcttcc gttggaagaa ctgcttttag gatttcgtga gtgtgagcct 120 
attctggaga


gctgctttgc 


ctggagcgga 


tctgctgatg 


gggaattcag 






tttttcattt 


cgacacttta 


ggctgtcacc 


ctgctagctt 


gcagcttgct 


240 




ttctcaggga


gagttcagct 


ttcagccttg 


tttttgcttc 


ggaaagataa 


300 




ttttcttgtg


tcttttttta 


cagtccccca 


ccccaagtgg 


gggttcagag 


360 


cctgggctca




ggcagtttaa 


gctgcctttc 


gtctcccgtg 


ggtttttcca 




gaaggtatcc


gggtgtggtt


tctttgtttc




gggcttgcag 


atcttcccac 




atctgtaagc


catgccttct 




gaagttctta 


atggttagtc


ctcaaacaca 






tttctccgtt


tccctcttgc 


^SSTQCtgcag 


ttacacctgc 


ccacgtggta 








ccctcttctg


ttcccttata 


tgctttagtt 


ctgacctttg 


660 








gctactacat 


ctaaaacaat 


gtttgttcag 










ggctgatccc 


acagctggtg 


agatccaggt 




ctgatttcat 






aactttccca 


gcggtttgtt 


aaaaagccca 










ttccttctag


actgtctgct 


acattccaat 










ctctctctct 


ctctctctct


ctctctctct 






^ 4. ^ 1. i. 1. 1. 1- 


ttttttcatg


ctaataccat 


gctgtttgat 


ggctagaact 








aggagaaatg 


tctcctgctc 


tattctttgt 


gctttctggg 








^999 1 c tgaa 




gtccacacac 


agttggggat 




tgtttcttct 




agtcccctgg 


gcattgttat 




gcattgaatt 








gtgcatgttt


tagcagcact


gagtcttgta


gcctgcaaat 








gtgggctgtt


ccacttctgc


ccacttgctt 


tataattttt 




gttttacgat 


cgttcacttc 


tttaactaaa


cttgtttcta 


attgctctac 


tactgataga 


1380 


gcaattaaaa 


atgattttta 


ttcttttcag 


ctaagtccct 


attggcaggt 


ggaagagcca 


1440 


ctgtcgggtc 


gggtgtgtgt 


cctgtggctt 


tgctggggtt 


tttagttgta 


atcatcttga 


1500 


gtaacatctt 


caagggattc 


tgtatatatg 


tgacacgctg 


tcgtccttga 


acactaacat 


1560 


catttctcat 


cctttctggt 


ctgggtctct 


gcattactat 


ctatctcgta 


ggactttcac 


1620 


tgttgaaatg 


agtaagtgtg 


acctggagtg 


agtgtgccct 


ggagtaagtg 


tgccctggag 


1680 


taagcgtaac 


ctggagtcct 


tgtcaggttc 


tggctccctg 


ccttctgcgc 


agcagacacg 


1740 
aaccgcagac acggactttg gaggccaagc tcttactgtc acaggcatat


1800 




cctactttgt gctgatgcca caccccatca aggcctcccc agtatctacg 


1860 


S-Qa-Ccatcat 


gtgggtttat catctcttct atagatgggg tgcactgtat ttgtcaattg 


1920 




agccatcctt gtctgaaact gtggaattac taggagaaat gtaaaggaaa 


1980 




cattagctca atgtctcttt gatggtgacc cccccaaagc atagcaaatg 


2040 




atcatgtcaa actacgaagc ctccgtactg cagaggaaac aaccaacagt 


2100 




catcctgcaa acagggagga aatgccagcc aatcacacat ctgatgatta 


2160 




atgtgggaaa taaaatctct ctctctctct ctctctctct ctctcttttt 


2220 


tttttttggt 


ttttcgagac agagttattc tgtgtagccc tggctgtcct gtaactcact 


2280 


ctatagatca 


ggctggcctt gaactcagaa atctgcctgc ctctgcctcc caagtgctgg 


2340 




gtgcaccacc accacctggc agaataaaat atcttaattg caataaaaac 


2400 




acaggcaata gtaccaaatg ggtgaccaac aggtacctga aaaaaaaagt 


2460 


tcaatattag 


ggaaacaaac caaaactaaa atgagctagc ctcatgtact ggctgtaagg 


2520 


cattttctta 


attagtgatc aaggtgggag ggcccattgt gagtggtgct atccctgggc 


2580 


tggtagttgt 


gggatttata agaaagtaag ctgaacaaac caggggaagc aagccagtaa 


2640 


gcagcattcc


tccatggcct ctgcatcagt tcctgcctcc aagatcctgc cctgtgtgag 


2700 


ttcctgtcct 


gacctccttt ggtgatgaac agcaatgtgg aagtgtaagc tgaataaacc 


2760 


ctttcctccc 


c 


2771 


<210> 97 






<211> 1629 






<212> DNA 






<213> Mus musculus 




<400> 97 






aacctggaaa 


cgtctaaaac tgaaagcaac tctcagcaca catacacaca ctcacaacaa 


60 


taaagtaaac aactttttta caaagggaaa agtagattgc atagtttcag ctatgtgccc 


120 


tggaactaac agatccacct cctggggcct cccaaggtgc ggaaggaagg tcggtgttaa 


180 


aggtgtgcca caccatattc cactttatta atgttccaag tcctggtctg tcactaacca 


240 


tctttaaaca 


ccgcattggg atcgtcttgt taaggcaagg gctgggtggg cggttggttg 


300 
tattgttgat


tcttgtctcg 


ggggctgatt 


ggtcattttg 


tcattttgaa 


aggtgtgctc 


360 


actcatacac 


acaagcacac 


accgtgtgtg 


ggaatccaga 


gtggcttggt 


taagaggggt 


420 


tcttccagat 


agtttgcttc 


tgtttcaccc 


tcagggattt 


tactgacatt 


tcttggctcc 


480 


ggagctttag 


atctcaggag 


tagtgtagat 


tcccgagagc 


agcagagtct 


ttcaggccgc 


540 


agtagacctg 


ttctctctaa 


tctgtatgct 


aatgggcata 


atgttgaatt 


gatgcccctt 


600 


aaaatgaaag 


ctcgacctgt 


gtctcactgt 


ggtggcaggg 


taaagggaga 


tggcatttaa 


660 


tgcttctctt 


gtgctgtttg 


caagcctaat 


accacacgtg 


tgtgtgtgtg 


tgtgtgtgtg 


720 


tgtgtgtgtg 


tgtgtgtgtg 


tgtgtgtgtg 


tgtgtgtgat 


ttgtattgta 


tatacaaagc 


780 


caaactactc 


ttttggtttc 


tgtaaactgc 


tatgaaactg 


ttctgcattt 


cagctttctt 


840 


gtgtattttg 


taagaattgt 


taaatggata 


tttggcatct 


gcttgactgc 


agtaaatcag 


900 


gcggatgtaa 


tgaacttaga 


atgtctgtaa 


gcaagtgttg 


atgtcttcat 


ggatggagga 


960 


agaaaattag 


ctaatgatgt 


ctagctaaat 


gaatttgtaa 


tttgtttttg 


tagttaatcc 


1020 


actgaagagg 


ttggaatagt 


tgatccattc 


ttcacccttg 


aataatattc 


tgttaacaga 


1080 


atatattcag 


atttcttcag 


tggctttgat 


aaatctatgt 


caaaattttt 


cagataaacc 


1140 


cagaagctgc 


tacttgaaag 


gtgaaaaaga 


ctgagttaga 


aaatctttct 


gacaggtttg 


1200 


agtgagacgg 


gctgttacca 


ggagcccatg 


agtgatagtt 


agatttatag 


gttactaaga 


1260 


atcaggtatt 


ttctttatgc 


tgagtacctg 


acatgtctat 


cagcagcacc 


tctgattctt 


1320 


acaacaaaac 


tttgtaatat 


tcctgacggt 


gccacagatt 


atctggagaa 


aaattacttg 


1380 


cctgaggaca 


gaagacattg 


cacatgtgtt 


cctcacagaa 


gctgctagaa 


aagaaggagg 


1440 


cagcatgtct 


gtctgtctgc 


agtaaaggag 


gcagcactcc 


tcttccagcc 


ctatgctgct 


1500 


gtacccctca 


aatggtcccc 


agtcaggtcc 


taaggctaag 


gtcatcatag 


cgttcaatta 


1560 


aaacagggaa 


aaaagggtgt 


gtccctagaa 


atccatctca 


actttagatg 


tgataaaagg 


1620 


aataaacgc 












1629 



<210> 98 

<211> 2277 

<212> DNA 

<213> Mus musculus

<400> 98 

attttttttt cttttttttt cctttctttc attttttttt ttaaaagatc cagccttaat 60 
aaaaggttgt attaaaagca cctggctagc atgtgttatg caaaaaatag cagagacaga 120

ctgactgtca atttgattgc ctatttttat aggaaggtaa aaatctgaat tactaaagaa 180 

atttagataa agtcctaaag ctaatttgtg ttaaatgttt taaagatctt attacctgtg 240 

agagatctgt gcattgactt tttgaagttt acgtaaattt aagagcagag cttaaggtct 300 

gtgccatagg ttgaggtaat gagcagcttg agaggtttgt acttttcttc agtgttgtct 360 

tctctgcttt taaaacactc agaaatatcc ataccactgc cagttcattt taaagcatct 420 

gtgagcaggg tattttttga aaggtatcat tggtgtttac cacaactgtt gttagctctc 480 

agaaatgatt tttcttaagt gtttaatctt gttgaaatgc aattaccttt taatttggga 540 

gttgtatcaa aatctatttt cttgttttgt atataggtct tgtaagtttt tgtcattttt 600 

tagctttttt taatagttca attacaatgc cctaatgtga agctgtgtgg acatacatgt 660 

gtttacaagg gggattagtt aatagtgaga ttaacaggta agtacaatgt ccatagatag 720 

aatttctctg aacattttaa aaatggaatc aaaaaataca tatacttgaa atgaaactgg 780 

agattttgta ctaattttaa tcctttatta ccttttattt tttcttttta aaaaaatcag 840 

ttggaaccct ttgtttgctc tccttatgtc acacttcttg cttggcctat ttcactcata 900 

agcgactttg gatgcagaca gtgtattagc cttgcttgct gctgctggtc cattacttta 960 

agtgttgtta ggaaaaaata aaatagcttt ggtttgtcta attttgcagt gtctcctacc 1020 

tgataacaca tgaatgtata gcagtaaaag taaaaattac aaaccttgtt atttgtcaaa 1080 

gcccttctgt tgtctaggag ccagtttcat ttgaaaactg gtttttaatg tgtaaaagca 1140 

atccaaagag atgaaatttt acagacacca aaaaagagag agagagacaa acaagtcaga 1200 

gagaagaaaa aaggagttaa acacgtgtag ttcatataga gaggcagtga cagcttgagc 1260 

ttgtcttgag tcccagcgtc gctcagtgct gtgtagtcac attccttgca gctagtagtg 1320 

agattccaca aggtatttct cctgacgtgg cactttgaat tcaatatact gctttgaggg 1380 

aaaggaaata gatttgtaga aaggattcct tcagtgtaaa tttacatgtg tttatgtttt 1440 

gtcagttttc atcagcaaga tgagcaagat tctaagtgac ttgccatagt tgttgatcat 1500

gtcaggccat ttcatgtctc cttaacctat aatcttaagc agttaactac aatttggaat 1560 

ttacaaatag catgtagctt aaggtattat tttaatttga ttctcctatg gagactgttg 1620

ctttagtttt tactagttta aagtcagttc taaaaataaa catggacgct ttcatgaaca 1680
ttgtataata ttttatagta aaggggggga agttgtttaa agggtaagac tttatctttc 1740

cctgtgttga cacgtgccaa tagcctgagt acattctgtt tcatacaaca agatgccact 1800 

gcaagactct tccactggag tgctcatgtc acttcagttt tgctacccaa gtggttgttg 1860 

cagtgtctat atcttgcaag gtggccccaa tattgtgttt taagcttcag cttcagttat 1920 

tttaagagga agtttatgct ctgattcttc actgagaatt cagtattgag attcttgcta 1980 

ttacacagat agactgattt ctaaatttta ttcatgaact tcatttgcat cccatagccc 2040 

tctgtaaatg tttcccccag ttgtatgtac ttactaaatg catgcttttt cattgtacat 2100 

actagacatt ttttgtgcta aaatttatta agtagggttt ttgtagcaaa gcattctatt 2160 

tttatgttct tatagtgtat gtgttaacat taatattcag tttaaacact ggtgtattta 2220

tttttagcaa cttgactcct agaagcctta gactaatttt taaataaaca ttcaacc 2277 

<210> 99 

<211> 2518 

<212> DNA 

<213> Mus musculus

<400> 99 

taagtatttg taatccctga agggaaaggt accaagacct tggagtcttg agcatcatct 60 

tagagcatca gtagctgaac agttaacatg tctaaatgtg tttgtccaca gtgtgtgcta 120 

tatgtataga atgtttgtgt gtacttatgt agagtatcta atatacattg aatggtacct 180 

tgctgcagag tttctgacat ggagaattgt ctaggactac ccttagcatt agctctgcca 240 

ctgttcctgt tacaggactc actagtgttg gctctagagc ttctggacat gcttatagat 300 

aatcagatca caaaggcact tttaaagctg tggactactc tgtctacatc atttatcttg 360 

aaaaataact gaagcagtct tttgtgagtc agaatcccag cctgccctag tgtaatgctg 420 

tctacctttg gggataggct aaaaatgtgt cttccctgag gcctctggaa ggatgctgca 480 

ttctcagatc tgatgggact ctaagagtct ggcccatagt cagccccatt tctcccacag 540 

tattgattcc taagaaatgg aaggtggtcc attgaacagc tctgctaagc ctgttgccca 600 

gtacacggta ccgggagctc ttgtcccttt cctttccatg gtacttcatc ctgtccctcc 660 

tatacctact gtcatcccta cctcggtgcc ctggagtctt ggaattcctc tccatatgat 720 

aaagcattat ttatcttaca ttaatacctt aaaggatggt tcttagtatt gaaaatatga 780 
acaagacatc


aaaagacttt 


gatgatagtc 


ttgttgaaaa 


actgaccatt 


gaagtacccc 


840 


ttgtataaaa 


tctgtgtgta 


tcttgggctt 


cacccctacc 


cccacatgct 


agcttctaaa 


900 


aagatttacc 


aggaacccag 


ttttaaaaat 


tctacttaga 


gtccttgttt 


tattggtata 


960 


tcgttagcat 


ttctgaaaat 


gtttttctaa 


aagtctatat 


aatttatcta 


actcttagta 


1020 


atgcctaatt 


aagtcaactg 


ttacttattt 


atgtttccat 


tattccaaag 


gaaaccagca 


1080 


tgtagaagac 


tcctctggtt 


gaaatagctt 


ataggttata 


tgtaactttt 


taaaattgtg 


1140 


cttagataga 


gccattttct 


catgcaacga 


tcttgtatag 


tgtgtaaggt 


gctctgtgcc 


1200 


cctctgtttt 


ttcccctcat 


cctgaaggta 


aggactattt 


ccaaggcaaa 


tgtagtgtaa 


1260 


gggaattggt 


tctgaaagag 


gaaaacttac 


tggttttcaa 


ggcccagcta 


gggttgtgtt 


1320 


ctaatctcct 


tgggaagttg 


agagcaggaa 


ggtacaaagg 


ttcttcttcg 


ccttctgccc 


1380 


ctttatcatg 


taaagaacaa 


ggatgaagga 


ccatgctctg 


aaataagaaa 


agaccgtcct 


1440 


gaaatactca 


tcatctgtct 


tttttattag 


aagtcaattt 


tccattgtct 


ttccaaataa 


1500 


gagagacctt 


agaggaagtt 


agctgaacat 


ggaaatttaa 


ttctttggct 


ttttttttct 


1560 


attaatcatc 


agttcaacct 


tgcttggcat 


taagtttgtt 


agttctgaac 


acctaagtat 


1620 


ttgtggctta 


gctattatgg 


atacaagata 


tttttcagac 


cagcagccac 


gttcctggtt 


1680 


ttggcagaaa 


atgatagtgt 


tgtagttcaa 


gtagcagtgt 


attttcccta 


taaagctcac 


1740 


actagcaccg 


aagcagaatc 


atagcatcag 


ttaaacatat 


actgcagttc 


aagaatgata 


1800 


atgcatcttg 


gggtctgagg 


catatcagct 


cttgtttcag 


agcatctgtg 


ttactcagac 


1860 


agaaacagac 


cagacagttt 


tacaggctcc 


agcacctctt 


tcttaccttc 


attcacccag 


1920 


gccaggcagc 


tggtaaatta 


gcagaagtct 


cggcacccat 


ttgacatccg 


ttgtcatcat 


1980 


gccttccaga 


ggactctatc 


aagccacaaa 


aactaatccc 


tggcatatta 


cagaggtact 


2040 


ttgtcactgc 


acttcctttc 


agactccatt 


tcatccttca 


tacttaacag 


gactgatatc 


2100 


tgaggtcatt 


ctcattctgt 


atctatagtg 


ggcatcttat 


agcagttggc 


tcattgcatg 


2160 


tggagagaat 


gtcagtaaat 


acttgaatct 


gcatggcagt 


ttgctgacgt 


gtatgcaaga 


2220 


gcaaggagcg 


aggaatgcag 


caagttactt 


gtacatacag 


taggtttcct 


taggtcttgt 


2280 


gggcacatcc 


cttgtaattt 


atatgtatat 


ctgttgtact 


gaagcatttg 


gggaaatact 


2340 


tgtatagaaa 


gtatgtatgt 


catatgtcaa 


aggaaatgct 


ttctgtcccc 


tgccaatgtt 


2400 
gtatttgtgg gtatttatag ttgtatgtat gtatgtacat caatgtgtag attgtgtatc 2460

agtatatggt atatgtatgt tgaaattatt tttgtctttt agcaaccagt ataatatg 2518 

<210> 100 
<211> 1950 
<212> DNA 
<213> Mus musculus

<400> 100 

gaaaatgtag aagacggaat caatgaaagt tcattccggc aggagaggaa aatcccatcg 60 

aagaagggga agagctgtcc ctccttagcc cccccgggga aacgagcccg aatcccgcgc 120 

tgagccgcag ccgatgctgg cagtctgcac ccggtgagga ggggcagaaa aaaaggcatc 180 

tggaggagga ggacgcggag agcggccggg gcacgagaca aaaagcccgt caaggttgcc 240 

ccgctcccgg ggggagccgg ccagcgcagc ggccgccaag aactcaagcc caacgtgggg 300 

gccgaggggg ccgggctgga cctgggcgag cggttctggg cgttccggcc gccagtgggc 360 

acgggcgccg agagctggga gcgtctgccc gcaacctcac agcctcaggc gcggctcgcc 420 

cctctccagg cgctcagcat ccctccgcgg cctcgcgctc gcttcagcgc cctggccttg 480 

gccggggtcc ggagcaccga ggcgtgtccc tgaccgtccg gctgcaggag ccccgccagc 540 

ctccgtgtgc ctcagcgacg cccggacgct ccgctgccgc gctccagcct cctccgccgc 600 

cctctggagg agctagggta acggcctcgg aaccgcagtg ccgcagcaca cccgccgccg 660 

gcctggcggg cgagcgcgcc actccctccg cgccagtatc cttccgccga ctccccgccc 720 

tcccgcgcct ctacctctgg attgtgggag aagaaaaccg ctgcaagagc gcgctcccgc 780 

gtgagcgcgc ccccggccgg cccgccgccg cccagattct ggcggagcgt ggacgcgcgg 840 

acgatgcagc tgcggcgggg gcccgcgtcg agtccgcgcc ccggacggat gccagggtgt 900 

agggggcgag actgaggaac cgggcaagac tgaggaaccg ggtctgtctg cttgctcgct 960 

catgtctaat tatggtttct ttcctcctgg agcgccttct ttcatgtgtg aaggaaaagg 1020 

aacgaaaccc atcttttttt taaacacgag gggtgcagac atcctctttc ggacactttg 1080

atttgctctg ttaaacatgc tttctcgccc cctaaaaggc ctaggaaagc tgcagcaaga 1140 

agcaaaagca catcttgggg ggatggaggt tgtttttgtt ttcccgtctt ttcatatata 1200 

tatgaaatat ttatagatat aatatatatt tatataattt atatattata tatatatgaa 1260 

atatttataa gagtcgagaa taaccaaagc tgcctttcaa catgggattt cctgactgga 1320
gtttccatga


tgcaggcatt 


attttttgac 


ttcttggtca 


ctgggaaaaa aaatccaaat 


1380 


tgagagccac 


aagagaaaat 


atttagaaat 


catccaagtt 


gtgtgacaaa catttgttaa 


1440 


aaatcattta 


tcttatgtac 


tagagggccc 


atcaaaatat 


gacagttcca cagcaaatta 


1500 


aatgttattc


ttcctcatac 


ccatagcatg 


tccaggataa 


gatggaatag aaaacatctt 


1560 


tttccaaaca 


caatatggta 


ttgtgatatt 


acaatgcatt 


ttatgtttgc acagtgctta 


1620 


caagtaccta 


tcctctttga 


ctactcatat 


gtttaatgat 


atctattttt gtatgtttta 


1680


tttttcattg 


attgtatgaa 


tttttgcctt 


ggatgtatgt 


ttcttcccca cttatgtgcc 


1740 


tggttcctgt 


ggaggtcaga 


agagggcact 


aggttccctg 


gaacagaggt tagaggcagc 


1800 


catgagccac 


tggatgccag 


gaatcaaacc 


ctggtcctct 


gcaccagcaa ccagagctca 


1860 


taaccactga 


gtcatctctc 


catctctcat 


aatttaaggt 


aacagtaaac atgttttatc 


1920 


atatactaaa 


ttaaacacca 


caatttcttc 






1950 



<210> 101 
<211> 3530 
<212> DNA 
<213> Mus musculus

<400> 101 



gtgtggccac tgcaggtgag aagtgcccaa agggagaaag tccagaagct atgatctccc 60

tcccttccct gcttgctcaa gtgcctgtgt ggaagattgg gtcaatcagc tctcaccagc 120

agcatgcaga aatgcagcaa gtctgtctgt cccctgaaaa tgatttagta ccgaacagct 180

cctccacatt ttcatacagt catctcttgg gtacatgccc tccctagagc tttggctcct 240

gcctccccac atctcaggac ctatgtcaga ggttgtttgg cggtgagttc tagtgagtca 300

ttttttcctt tctttgacat gcctgaaaat gcttctgtgt gtgctgctca gaggtggcct 360

cgaagaagtt ctggctctgt gagtaaggca gagtggagcg ggcaccttcc tagtgggaag 420

taaataggcg ctttggcttt cccaaaagca aacccaagga actcttcatg gtcttggaag 480

atgcattctc attcgttaaa ggccgagtgc tgactctggg tttgaaaact gcattttctt 540

gaggcaggtc tcaggcttta caagtccaca ctatgcaaga agagaggcaa ggaaggcgag 600

gaaaggcgat cccggcaggg gtggggggtt tctctagccc ttgtttttag aaggtgcatt 660

gccagcgctg gttatgattg gcagctgatg attaggaatc tcgacactcg tatttcatgt 720

ctgagaagaa caccatttcc tgagagaaga cgaacaggcc tagtctaccg ccggcgtagg 780

ttttgtcata gatgggtccc gagtcagcca tgattttctt tccttcatac atcaccactc 840

tgatataacc ggtctttggc ctgtggctga gacgccatct gtatgcagtg aaatctttcc 900

agccgatgtg gcgagggtca tgccacaggg tgcgcacctg gccaggggtg tttcctgtgt 960

gccacagtgc attccgcagg tgctcgccag ggccggtggt ggagttcaca acctttacag 1020

acaggcctga gtatccctga gcccttgtgg ggttggtgtc ccagtaggac tgggtgactt 1080

gtctccacat cacaacgtag aagcggctgc tggactggta gccgaaaaca aagccagcgt 1140

agtcatcatc tctctcggtg ttgatgaaga aggtaccgct gaagtccaca gcattaaact 1200

catcataacc tacagcaagt ccagggtcac agtttacagt ctggacaagt tctttgccct 1260

gatggcggac aacccagtta gggtcatttt gggaggttcc tttgggatct agaggaatca 1320

tctggaattg tcggaaatcg gtttcactga tgtcaaaatt ctcaggacag atgtcatcaa 1380

tatctggcac attgtcatgg tcaaagtcgt ctttgcaggc gtcacctcgg ccatcaccat 1440

cagagtcctt ctggtcagga ttgggcacca gcctgcagtt gtctctgtca tcagggatgc 1500

cgtcattgtc atcgtcatgg tcacaggcat ctcctttgcc atctttatca tggtcggcct 1560

ggttggcatt aggcacatag ggacagttgt ccagattgtt ctgatggcca tcctcatcga 1620

tgtcctgatt gttgtcacaa gtgtccccta tgaggtctga gtcagagtcc agctggtctg 1680

gattgtgttc cagggggcag ttgtcacact gatctccaac cccatccatg tccgtgtccc 1740

tctggtccac gttgtaaacg tactggcagt tgtctcgttc attgaggatt ccatctccat 1800

cgatgtccac agcacaggca tcgccctccc cgtttttgtc tgtgtctgct tggtcagggt 1860

tgtggttgta ggggcagttg tcacagcggt ctcccacatc atctctgtca tagtcatact 1920

gggctgggtt gtaatggaat ggacagttgt ccctgtcatc ggggatcttg tcgttgtcat 1980

cgtcatcatc gcaggcatcg ccaatcccgt ccttgtcata gtcttcctgc cccgagttgg 2040

gaaggttggg gcagttgtcc tttttgcagt ggtaggttgc gttggccaca cacaccaggt 2100

tttcattagg ccagccgtcc aggtctgtgt cctctccgca gatgatgcca ttgcctgcat 2160

agccgggctt gcactcacag cggtacatgg ggtcactgta gtgacccagg tagttgcact 2220

tagcgttctt gttgcagtca tgcgtcccgt ccgtgcaggg gtttcgcggt ttgcacacct 2280

gtttgttggc catggcatgt tcgacacctc ggccgaaggg ctgtgagcca gtgaatcgtg 2340

gtgggcaggg caggcagttg tagccaggat ctgtgttctt gcaccgatgt tctccgttgt 2400

gattgaagca agcatcaggc acttctttgc actcatcgac gtctttgcac tggatgccat 2460

ttccactgta gccaggagga cacgcaccac atttccagct accatcaggg tagctagtac 2520

acttggcacc agcaaagcca gggattggac aggcatccat caattgggca gtcctgcttg 2580

ttgcaaactt gatttttctg tcacatcgcc aacacagtct ttgcctccaa actggggtgt 2640

ggggttgtta cagagtcggc tgcgtctctg cactcctcct ccacaggtga cagagcagat 2700

gtcccatggt gaccagggac cccagccctc cattaattgg gcaggcgtct ttcttgcagg 2760

ctttggtctc ccgggcttca ccttcacagg gcttcccgtt catctggggg ctgggggagt 2820

tgcagagccg gatccttgtg atcacaccgt caccacaggt cacagaacag gacgaccatg 2880

gagaccagtg actccagcca ccatcctgtt taaatctttt gtcacactcc tgaatgtggc 2940

aggtccttgt ctgtaccgaa gagccctcgc atctgttgtt gaggctgtca caggaacgac 3000

cacgttgctg aattccattg ccacatgtgg cagagcagga ggtccactca gaccagggag 3060

accagccatc gtcagcagag tcgctgggcc agcaccgtgg gcagcattca ccatcaggaa 3120

ctgtggcgtt ggagcagggc atgataggac aggacacctt tttgcagatg gtaaccgagt 3180

tctggcagtg acactctgtg caactgtcta cagtccactc ctcgttgttc ttgtactgga 3240

ctccattgtg aaagcagagg ggaggccgct tcagctcact gaccagctct ctgttctctt 3300

ccgtcacttt tcggatgctg tcctgcagag tggtcacgat ggtgcgcagg cccttcagtt 3360

ccaggaccat gctggatagt tcatcacagg agaggccaca gatagcttgg aggtcctttg 3420

ttttgtggcc gatgtagttg gtgcggatag cagggctgga accgttcacc acgttgttgt 3480

caagggtaag aaggacgttg gtagctgagc tggagcagcc tttgttcctg 3530


<210> 102 
<211> 1857
<212> DNA 
<213> Mus musculus 




<400> 102

tatagtgatt aattatcgat ttttccatct gctgaagatt ttccgtgtta tagctttaag 60

attaactttt acagaagcaa tttatttttc aaataagaaa agtatatata ttatttacca 120

cgttaatcag tatatcttct tagctttcac ttttaaaaaa aaaaaaaaac ctcttcttta 180

tctgggaaat gtcaaaactt acacagactt gaatgtgtga ttttgtgata gaagctttta 240

gtactgactg tccttttatt gaagtttgcc agcagtgtgc acttttgccg aagctgaaaa 300

gccttagatc aatgtgttga gcaatagctg gtagaatatt ttaataacat acataaactt 360 

aatggaagga gagcaacaca atgaaatata aaccagtttt tctcttgaaa cttgcttgag 420 

ttagttacaa aatctgttga ctagtaaaat gcctggtgga actatagcct tatttatgca 480 

ttggatatta gaactataac cttatttatg catcagatat tagaaactta ataaaatttc 540 

agtttgatac tctcaaatta tgggattttt atttttgtaa tttctaaatt actatctaaa 600 

caacaaaata caoacttgag ttaattagac atttttcatg cactgtttat gaaggaaatc 660 

acctcttcafc acaatgagtg tcatgcttac tctggaaatt catgtcatgc tagttggcac 720 

cctgccttcc ttcatgccca gcattttaat gtttgagaac ttactgcaag cattgtgata 780 

gagtagaaaa gaacctaaga cgcactaaag ggaggccagc ctcagccagt tttgtgacct 840 

tgacagtcag gtcacctgaa tcaccacaca gcctatggtg tgaaatactg aagagacatc 900 

ctcctcgaat cacttttaga tagtatttcc ttgtggccac tatccacctc catggctcca 960 

tgtagtggtg aaagatcagc tgttgtctct gccacagtgt gttgcctctg tggtaagatt 1020 

gaaaggaact gtgtgtacac agaggaccat tggtaagaag gcagtgagtg ggagcctatc 1080 

tcagcggagg cgtgggctga cagtgatggg ggcacagtga ggctgcatat caagggaact 1140 

gtgaaattta tatgtattta tacgtatatg tgtatatata tttatctctc tctcacatat 1200 

atttatctct ttctctctct ctctctctct ctctctctct ctctctcaca cacacacaca 1260 

cacacacaca cacacacggt tctgtgttca tgcacactgg aggaaggact caaattttat 1320 

tacagatggt tgtgagccac catgtggttg aactcatcac atctggaaga acagaggaac 1380 

agccagagtg gtcttaactg ctgagccatc tctttagtgc cccccccccc cccataactt 1440 

tttaagtaaa gagagtactg gggaaggcga gcagattggt tgggagatat acaggagccc 1500 

gcatgggaaa cagcagtggc ttagatttga gagagaccgc cgggtgagag gagaggtgag 1560 

agtttaggga gccatggtac tgagtttgat acagaattgg agagaggata gagagtgtaa 1620 

gggtattgtt ggtgttgcca tcgttggatg agttggaagg agtgggtttt acgggggact 1680 

attatagtaa aagagtaatg atgaatttac aagttctctt ttgttcattt taaattacga 1740 

aggctgtgtt ctgaacttta ccattagagg agccacacat cttgaagaaa tatttatatc 1800 

ttagaggaag agttgaagat ttttagactg ttggcaaatt gaaaacagta actgatg 1857

<210> 103
<211> 4304
<212> DNA
<213> Mus musculus

<400> 103 

tttttttttt tttttttttt taagactgca gataatcttg ttgagctcct cggaaaatac 60 

aaggaagtcc gtgtttgtgc agagcgcttt atgagtaact gtatagacag tgtggctgct 120 

tcactcatcc cagagggctg cagctgtcgg cccatgaagt ggctgcagtg cctcgtgaga 180 

tctgctttgt tttgtttgga gtgaagtctt tgaaaggttt gagtgcaact atataggact 240

gtttttaaat aagtagtatt cctcatgaac tttctcattg ttaagctaca ggacccaaac 300 

tctaccacta agatattatt aacctcaaaa tgtagtttat agaaggaatt tgcaaataga 360 

atatccagtt cgtacttata tgcatcttca acaaagattc tctgtgactt gttggatttg 420 

gttcctgaac agcccatttc tgtatttgag gttaggaggg cataatgagg catcctaaaa 480 

gacaatctga tataaactgt atgctagatg tatgctggta ggggagaaag cattctgtaa 540 

agacatgatt taagacttca gctctgtcaa ccagaaacct tgtaaatact tcctgtcttg 600 

gtgcagcccc gcccctttga tcacacgatg ttgtcttgtg cttgtcagac actgtcagag 660

ctgctgttcg tccctctgca gatctcacct gtccccactg cacacccacc tcctgcctct 720 

tgcagacctc agcatctagc tttagttgga aacagttcag ggttcaggtg acttcttaaa 780 

aaaaaaaaaa aaaccctacc tcctcagaat gaggtaatga atagttattt atttaaagta 840 

tgaagagtca ggagcgctcg aacatgaagg tgatttaaga tggttccttt cgtgtgtatt 900 

gtagctgagc acttgttttt gtcctaaagg gcattataca tttaagcagt gattctgttt 960 

aaagatgttt ttctttaaag gtgtagctca gagtatctgt tgttggaatt ggtgccagag 1020 

tctgcttaat agatttcaga atcctaagct taagtcagtc gcatgaagtt aagtagttat 1080 

ggtaacactt tgctagccat gatataattc tactttttag gagtaggttt ggcaaaactg 1140 

tatgccttca aagtgagttg gccacagctt tgtcacatgc acagatactc atctgaagag 1200 

actgcccagc taagagggcg gaaggatacc cttttttcct acgattcgct tctttgtcca 1260 

cgttggcatt gttagtacta gtttatcagc accttgacca gcagatgtca accaataagc 1320 

tatttttaaa accatagcca gagatggaga ggtcactgtg agtagaaaca gcaggacgct 1380

tacaggagtg aaatggtgta gggaggctct agaaaaatat cttgacaatt tgccaaatga 1440

tcttactgtg ccttcatgat gcaataaaaa gctaacattt tagcagaaat cagtgattta 1500

cgaagagagt ggccagtctg gtttaactca gctgggataa tatttttaga gtgcaattta 1560

gactgcgaag ataaatgcac taaagagttt atagccaatt cacatttgaa aaataagaaa 1620

atggtaaatt ttcagtgaaa tattttttta aagcacataa tccctagtgt agccagaaat 1680

atttaccaca tagagcagct aggctgagat acagtccagt gacatttcta gagaaacctt 1740

ttctactccc acgggctcct caaagcatgg aaattttata caaaatgttt gacattttaa 1800

gatactgctg tagtttagtt ttgaaatagt atgtgctgag cagcaatcat gtactaactc 1860

agagagagaa aacaacaaca aattgtgcat ctgatttgtt ttcagagaaa tgctgccaac 1920

ttagatactg agttctcaga gcttcaagtg taaacttgcc tcccaagtcc tgtttgcaaa 1980

tgaagttggc tagtgctact gactgctcca gcacatgatg gaaggcaggg ggctgtctct 2040

gaagtgtctt ctataaaggg acaatagaat agtgagagac ctggtcagtg tgtgtcagct 2100

ggacactcca tgctatggga cttgcatctt ctgtcctcac catccccaag acattgtgct 2160

ttcctcagtt gtcctctagc tgtttcactc agacaccaag atgaattact gatgccagaa 2220

ggggccaaaa tggccagtgt gttttggggg ttgtatcagt tgactggaca ataactttaa 2280

tagtttcaga tcatttattt ttacttccat tttgacagac atttaaatgg aaatttagtc 2340

ctaacttttg tcatttgaaa ggaaaaatta acagttccta taagatactt ttgaggtgga 2400

atctgacatc ctaatttttt ttcttttcag tgggtttgca gcgagggtct tgtatgcact 2460

aggcaagggt tctaccacta agccacattt cccaggaaat aaaatgttaa cagttaaaac 2520

atacacacaa atacacaaac accttattac cactttagta aagtgagaga tgtgcgtcct 2580

ttgtctcagt ctccacgatt tcagctgccc cttgtatgaa taactcagtc tcgctaaact 2640

gtttactttt atttacctgg tttgactagt tgcagctata taaccagttg tgcatgagga 2700

caacagccag tgtgtttgtt ttgtttttgg ttttttgtgg tacatttttt gtaaagaatt 2760

ctgtagattg aagtgctctt tgaaaacaga actgagatat atttattctt gttagcatca 2820

aaaaacattt tgtgcaaatg atttgctttt cctggcaggc tgagtaccat atccagcgcc 2880

cacaattgcg ggttcccatc taccatgtcc acaggggaga cagacgggaa gcacatgagg 2940

ggtgtgttta cagagttgta ggagttatgt agttctcttg ttgccttgga aatcactgtt 3000

gttttaagac tgttgaaccc gtgtgtttgg ctgggctgtg agttacatga agaaactgca 3060

aactagcata tgcagacaaa gctcacagac taggcgtaaa tggaggaaaa tggaccaaaa 3120 

taaggcaggg tgacacataa accttgggct tcggagaaaa ctaagggtgg agatgaacta 3180 

taatcacctg aatacaatgt aagagtgcaa taagtgtgct tattctaagc tgtgaacttc 3240 

ttttaaatca ttcctttcta atacatttat gtatgttcca ttgctgacta aaaccagcta 3300 

tgagaacata tgccttttta ttcatgttaa ctaccagttt aagtggctaa ccttaatgtc 3360 

ttatttatct tcattttgta ttagtttaca taccaggtat gtgtgtgtgc tgtactcttc 3420 

ttccctttat ttgaaaacac ttttcactgg gtcatctcct tggccattcc acaacacaac 3480 

tttggtttgg ctttcaatgt caccttattt gatggcctgt gtcccagtag cagaatttat 3540 

ggtattccca ttgctggctg ctcttccgac cctttgcttc tacagcactt gtctctccta 3600 

agatagtcag aaactaactg atcaggggat ggacttcacc attcatcgtg tctcttcaat 3660 

tctattaaat agaccactct tgggctttag accaggaaaa aggagacagc tctagccatc 3720 

taccaagcct caccctaaaa ggtcacccgt acttcttggt ctgaggacaa gtctccactc 3780 

cagtaaggga gaggggagga aatgcttcct gtttgaaatg cagtgaattc ctatggctcc 3840 

tgtttcacca cccgcaccta tggcaaccca tatacattcc tcttgtctgt aactgccaaa 3900 

ggttgggttt atgtcacttc agttccactc aagcattgaa aaggttctca tggagtctgg 3960 

ggtgtgccca gtgaaaagat ggggactttt tcattatcca cagacctctc tatacctgct 4020 

ttgcaaaaat tataatggag taactatttt taaagcttat ttttcaattc ataagaaaaa 4080 

gacatttatt ttcaatcaaa tggatgatgt ctcttatccc ttatccctca atgtttgctt 4140 

gaattttgtt tgttccctat acctactccc taattcttta gttccttcct gctcaggtcc 4200 

cttcatttgt actttggagt ttttctcatg taaatttgta taatggaaaa tattgttcag 4260 

tttggataga aagcatggag aaataaataa aaaaagatag ctgg 4304 

<210> 104 
<211> 3673 
<212> DNA 
<213> Mus musculus

<400> 104 

tgcctcctgt actaatgtgt ttatggctat ttcccacctt ctcttctatg caatttattg 60 

tatatggctt tatgttgaag tccttcatcc actttgactt gaaaattgtg cagggtgata 120
aaaatggatc tgttttcttt ttttctgcat gcagacatca gttaagtcag tatcatttgt 180
tgaagatgtt ttcatttgtc cattgtatgg ttttgtcttc catgtcaaat tcaagagtcc 240 
ataagtgtgt gggtttattt tggggtcttc aattttattt cattgatcaa catatctgtt 300 
tctgaaacca attccatgca gtttttatca ctatttctct gtattacagt gtgaggtcag 360 
ggatgatgat tgcttcacaa gtttttattt atttgtttgt ttgtttttgt ttgtttattt 420 
tcgtttgttt tgattagaat tgtttgactc tcctgtgagc tttttttttt tttttttaat 480 
ttccatgtga agttgagaat ggctctaaac atttgtgttg gagttttgct gaggattgca 540 
ttgaatctgt ctcttgcttt tggtaagcaa tctactatgt tccttctact aatctaggag 600 
catgggaggt ctttccatat tctgataact ccttcaattt ctttcttcaa agacttgaag 660 
ttatttgtca tagagttctt tcacttactt ggttagagtt acaccaatat attttgtatt 720 
acctgtggct attgtgaagg gtatcttttt ccctaatttc tttctctgcc cctctatcat 780 
ttgtataaag gagtagaact gttttctatg atttaatttt gtatctaaca acgtcttgtg 840 
agccattctt gggctcagct ccagtgcacc agagtattcc ttgctaaaga tggcatttcc 900 
agtgcttcct gacttttagc tttggagtta gctccacact actttgatct cttctcaagc 960 

ctttctttgg ttttgttttt agcagaactt agttctttcc atgtgcagat ctgtctatat 1020 

gtagcaaata aaatcctgga tggattattc tttagttttt agcagttgat ttttgttttc 1080 

ttttcatctt aatttctttc atgtaactct gatccattat ttttagaaac tttgcaaaat 1140 

actctatttt gttcatttgc aatcacaaat tattgatatt gcttataata gtcagttgta 1200

ctgggaggag aagcttacat attaaaatat actaaaatat tttatcatta aattttatca 1260 

tacacaacac acgtgcgtga gcacgcacac acacgcacac acacacacac acacacacac 1320 

acacacacac acacacctat tgaggtaaga tgtcataatt ttcttggtgt tcactgtata 1380 

ggcaggacta gctttggctt cttatcctcc ttcaatcttc tgagtagttg gaactaaagg 1440 

tgtgtacctc caagccatgt tatattgctt caattttctt tcccactgtt actagtgaat 1500

agagactttg ttgtccattt cattctttaa attttgagga attaataagg atttcctttt 1560 

atttgaaaca aattaattca acagtttctc atttttctta gagaaaaaca agcccttact 1620 

gtctggcttc tctgctggca ccctatctcc tcagcttctg tctcccacac aaatcctgtt 1680 

gtagggaagt agactttgta tttcctgtga tgcagtttaa tctctcattt tactcagatt 1740
ctttctgaat atattccagt ttagcagccc acactaaacc tggaatttat gctttgactt 1800
cctcaccaga gatgaaatgc ctttaggaat cagttcagga aacaacattc tggtaggaag 1860 
tgtgaaaagg ttcatcaatt tctaccaaaa gagaatggaa gacatgctgg tgttgaccag 1920 
ccttgtatca tatgttcata tgacttttca aaaacagtat tttctttttg gttgttcttg 1980 
tcttccctaa gggagtcccc agaagatgag ttatgtttgg ctattagcag ttcagaattc 2040 
ataatatttt attagaatat catctgaagt aatttttgca caagtagtta aagtaagtgc 2100 
ttttataaga ccagcgagca tcttgtgcag gaatattaca gctctcttcc cacagattct 2160 

attggcaatc ttctgttttg tgaggaaatg catgatatat taagcttttt gtgagtttta 2220 

aaaacctctt atttcctact tagcacattt acatgatatc ataaaaagtg gatttttttt 2280 

tctgatacat gaattaccat ttagatgtat gcaaagctaa ttaccacact tccttcctgc 2340 

agggatttgg agtgtcaggg tggagaatag accattactg agatttcttt tcttttgttt 2400 

tattttttac atggctaaag atgccggtct aagaagatat ttctatagtc ctttttagga 2460 

tacttctctt gatagtatgt tgcagttggc tttgaagtaa cttatcctga ggaacaatct 2520 

gtacagagcc atggaagaca tccctaattc cctgaataga agagaaagca aacaaaggaa 2580 

aacaaaactt cttcaccctc tgcttttcta taaatttgtc tttgttatta gtccatcttt 2640 

gtgattgtca ggctatgaag actgtgacta tcatactgtt gtttggcagc aagtatctaa 2700

attataagaa cagaaacaag ggatatgcca gactccaatg tttgtagcca gttatttctt 2760 

ggcatttgtt aaaatatatt ttgcacatac aagatccaaa catactccca tcaaaaccgt 2820 

gcttgatgtc atataagtct gctaaatcct aattactatg ttttctaact agattcatgt 2880 

gcttattgtt ccatccataa gctttataat aagtgttgtt caaaacattg ggttccatat 2940 

gtgtatttgg ttatgtactt tgaccataca aattacttca acatttaaaa caaatataca 3000 

aataaataaa acaaaatcta cttacttcca aacatgagat ctgatatgtt gtgattaata 3060 

tccatagtat ttttccattg gaaagaattg attttccctt ctgcagcata tataagtgac 3120

agttcacttg ttaaccttta cccacatggc tagggtttga ttctttgaaa atataacaaa 3180 

ataaaatata agacaaaaca aaaactatca agtagaagtt gggcaagaca aaacaatagt 3240 

agaaaaagag cccaagagaa accataaggg tcagagtata ttgattcata tactcagtaa 3300 

tctcataaga ataataaatt ggaaatgaaa atatcaattc agagaaccaa gtgtggctct 3360 
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ttgaaagttc taagcaccat tcataagtct ctttgagctg atatcagctt tgatcatgtt 34^0 

gatgtatagg gccttgtatt attggcatcc tacattttct tgctactatt atttgcctct 348(?'. 

actgtttttt gctgcctcct cttccttgtg gttcacttag ctctgagagg agggatttga 3540 

tggagacatc catttacaac tgatagttcc aagttctcta attctctgtg tgatggctgt 3600 

ctacagatct ctaaatttgt tccaatctgc tataggagga agcttctctg atgacagctg 3660 

aacaagataa acc 3673